Detection of nucleic acid reactions on bead arrays

ABSTRACT

The present invention is directed to methods and compositions for the use of microsphere arrays to detect and quantify a number of nucleic acid reactions. The invention finds use in genotyping, i.e. the determination of the sequence of nucleic acids, particularly alterations such as nucleotide substitutions (mismatches) and single nucleotide polymorphisms (SNPs). Similarly, the invention finds use in the detection and quantification of a nucleic acid target using a variety of amplification techniques, including both signal amplification and target amplification. The methods and compositions of the invention can be used in nucleic acid sequencing reactions as well. All applications can include the use of adapter sequences to allow for universal arrays.

This application is a continuation of U.S. application Ser. No. 14/322,529, filed Jul. 2, 2014, which is a continuation of U.S. application Ser. No. 13/913,922, filed Jun. 10, 2013, which issued as U.S. Pat. No. 9,279,148, which is a continuation of U.S. application Ser. No. 12/212,585, filed Sep. 17, 2008, which issued as U.S. Pat. No. 8,486,625, which is a continuation of U.S. application Ser. No. 11/238,826, filed Sep. 28, 2005, now abandoned, which is a continuation of U.S. application Ser. No. 09/553,993, filed Apr. 20, 2000, now abandoned, which is a continuation of U.S. application Ser. No. 09/535,854, filed Mar. 27, 2000, now abandoned, which is a continuation-in-part of U.S. application Ser. No. 09/517,945, filed Mar. 3, 2000, which issued as U.S. Pat. No. 6,355,431, which is a continuation-in-part of U.S. application Ser. No. 09/513,362, filed Feb. 25, 2000, now abandoned, which is a continuation-in-part of U.S. application Ser. No. 09/425,633, filed Oct. 22, 1999, now abandoned, which is based on, and claims the benefit of, U.S. Prov. App. Nos. 60/161,148, filed Oct. 22, 1999; 60/160,927, filed Oct. 22, 1999; 60/160,917, filed Oct. 22, 1999; 60/135,051, filed May 20, 1999; 60/135,053, filed May 20, 1999; 60/135,123, filed May 20, 1999; and 60/130,089, filed Apr. 20, 1999, each of which is expressly incorporated herein by reference in its entirety.

FIELD OF THE INVENTION

The present invention is directed to methods and compositions for the use of microsphere arrays to detect and quantify a number of nucleic acid reactions. The invention finds use in genotyping, i.e. the determination of the sequence of nucleic acids, particularly alterations such as nucleotide substitutions (mismatches) and single nucleotide polymorphisms (SNPs). Similarly, the invention finds use in the detection and quantification of a nucleic acid target using a variety of amplification techniques, including both signal amplification and target amplification. The methods and compositions of the invention can be used in nucleic acid sequencing reactions as well. All applications can include the use of adapter sequences to allow for universal arrays.

BACKGROUND OF THE INVENTION

The detection of specific nucleic acids is an important tool for diagnostic medicine and molecular biology research. Gene probe assays currently play roles in identifying infectious organisms such as bacteria and viruses, in probing the expression of normal and mutant genes and identifying mutant genes such as oncogenes, in typing tissue for compatibility preceding tissue transplantation, in matching tissue or blood samples for forensic medicine, and for exploring homology among genes from different species.

Ideally, a gene probe assay should be sensitive, specific and easily automatable (for a review, see Nickerson, Current Opinion in Biotechnology 4:48-51 (1993)). The requirement for sensitivity (i.e. low detection limits) has been greatly alleviated by the development of the polymerase chain reaction (PCR) and other amplification technologies which allow researchers to amplify exponentially a specific nucleic acid sequence before analysis (for a review, see Abramson et al., Current Opinion in Biotechnology, 4:41-47 (1993)).

Sensitivity, i.e. detection limits, remains a significant obstacle in nucleic acid detection systems, and a variety of techniques have been developed to address this issue. Briefly, these techniques can be classified as either target amplification or signal amplification. Target amplification involves the amplification (i.e. replication) of the target sequence to be detected, resulting in a significant increase in the number of target molecules. Target amplification strategies include the polymerase chain reaction (PCR), strand displacement amplification (SDA), and nucleic acid sequence based amplification (NASBA).

Alternatively, rather than amplify the target, alternate techniques use the target as a template to replicate a signaling probe, allowing a small number of target molecules to result in a large number of signaling probes that can then be detected. Signal amplification strategies include the ligase chain reaction (LCR), cycling probe technology (CPT), invasive cleavage techniques such as Invader™ technology, Q-Beta replicase (QβR) technology, and the use of “amplification probes” such as “branched DNA” that result in multiple label probes binding to a single target sequence.

The polymerase chain reaction (PCR) is widely used and described, and involves the use of primer extension combined with thermal cycling to amplify a target sequence; see U.S. Pat. Nos. 4,683,195 and 4,683,202, and PCR Essential Data, J. W. Wiley & Sons, Ed. C. R. Newton, 1995, all of which are incorporated by reference. In addition, there are a number of variations of PCR which also find use in the invention, including “quantitative competitive PCR” or “QC-PCR”, “arbitrarily primed PCR” or “AP-PCR”, “immuno-PCR”, “Alu-PCR”, “PCR single strand conformational polymorphism” or “PCR-SSCP”, allelic PCR (see Newton et al., Nucl. Acid Res. 17:2503 (1989)), “reverse transcriptase PCR” or “RT-PCR”, “biotin capture PCR”, “vectorette PCR”, “panhandle PCR”, and “PCR select cDNA subtraction”, among others.

Strand displacement amplification (SDA) is generally described in Walker et al., in Molecular Methods for Virus Detection, Academic Press, Inc., 1995, and U.S. Pat. Nos. 5,455,166 and 5,130,238, all of which are hereby incorporated by reference.

Nucleic acid sequence based amplification (NASBA) is generally described in U.S. Pat. No. 5,409,818 and “Profiting from Gene-based Diagnostics”, CTB International Publishing Inc., N.J., 1996, both of which are incorporated by reference.

Cycling probe technology (CPT) is a nucleic acid detection system based on signal or probe amplification rather than target amplification, such as is done in polymerase chain reactions (PCR). Cycling probe technology relies on a molar excess of labeled probe which contains a scissile linkage of RNA. Upon hybridization of the probe to the target, the resulting hybrid contains a portion of RNA:DNA. This area of RNA:DNA duplex is recognized by RNAse H and the RNA is excised, resulting in cleavage of the probe. The probe now consists of two smaller sequences which may be released, thus leaving the target intact for repeated rounds of the reaction. The unreacted probe is removed and the label is then detected. CPT is generally described in U.S. Pat. Nos. 5,011,769, 5,403,711, 5,660,988, and 4,876,187, and PCT published applications WO 95/05480, WO 95/1416, and WO 95/00667, all of which are specifically incorporated herein by reference.

The oligonucleotide ligation assay (OLA; sometimes referred to as the ligation chain reaction (LCR)) involves the ligation of at least two smaller probes into a single long probe, using the target sequence as the template for the ligase. See generally U.S. Pat. Nos. 5,185,243, 5,679,524 and 5,573,907; EP 0 320 308 B1; EP 0 336 731 B1; EP 0 439 182 B1; WO 90/01069; WO 89/12696; and WO 89/09835, all of which are incorporated by reference.

Invader™ technology is based on structure-specific polymerases that cleave nucleic acids in a site-specific manner. Two probes are used: an “invader” probe and a “signaling” probe, that adjacently hybridize to a target sequence with a non-complementary overlap. The enzyme cleaves at the overlap due to its recognition of the “tail”, and releases the “tail” with a label. This can then be detected. The Invader™ technology is described in U.S. Pat. Nos. 5,846,717; 5,614,402; 5,719,028; 5,541,311; and 5,843,669, all of which are hereby incorporated by reference.

“Rolling circle amplification” is based on extension of a circular probe that has hybridized to a target sequence. A polymerase is added that extends the probe sequence. As the circular probe has no terminus, the polymerase repeatedly extends the circular probe, resulting in concatemers of the circular probe. As such, the probe is amplified. Rolling-circle amplification is generally described in Baner et al. (1998) Nuc. Acids Res. 26:5073-5078; Barany, F. (1991) Proc. Natl. Acad. Sci. USA 88:189-193; and Lizardi et al. (1998) Nat. Genet. 19:225-232, all of which are incorporated by reference in their entirety.

“Branched DNA” signal amplification relies on the synthesis of branched nucleic acids, containing a multiplicity of nucleic acid “arms” that function to increase the amount of label that can be put onto one probe. This technology is generally described in U.S. Pat. Nos. 5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by reference.

Similarly, dendrimers of nucleic acids serve to vastly increase the amount of label that can be added to a single molecule, using a similar idea but different compositions. This technology is as described in U.S. Pat. No. 5,175,270 and Nilsen et al., J. Theor. Biol. 187:273 (1997), both of which are incorporated herein by reference.

Specificity, in contrast, remains a problem in many currently available gene probe assays. The extent of molecular complementarity between probe and target defines the specificity of the interaction. In a practical sense, the degree of similarity between the target and other sequences in the sample also has an impact on specificity. Variations in the concentrations of probes, of targets and of salts in the hybridization medium, in the reaction temperature, and in the length of the probe may alter or influence the specificity of the probe/target interaction.

It may be possible under some circumstances to distinguish targets with perfect complementarity from targets with mismatches; this is generally very difficult using traditional technology such as filter hybridization, in situ hybridization etc., since small variations in the reaction conditions will alter the hybridization, although this may not be a problem if appropriate mismatch controls are provided. New experimental techniques for mismatch detection with standard probes include DNA ligation assays where single point mismatches prevent ligation and probe digestion assays in which mismatches create sites for probe cleavage.

Recent focus has been on the analysis of the relationship between genetic variation and phenotype by making use of polymorphic DNA markers. Previous work utilized short tandem repeats (STRs) as polymorphic positional markers; however, recent focus is on the use of single nucleotide polymorphisms (SNPs), which occur at an average frequency of more than 1 per kilobase in human genomic DNA. Some SNPs, particularly those in and around coding sequences, are likely to be the direct cause of therapeutically relevant phenotypic variants and/or disease predisposition. There are a number of well known polymorphisms that cause clinically important phenotypes; for example, the apoE2/3/4 variants are associated with different relative risk of Alzheimer's and other diseases (see Cordor et al., Science 261 (1993)). Multiplex PCR amplification of SNP loci with subsequent hybridization to oligonucleotide arrays has been shown to be an accurate and reliable method of simultaneously genotyping at least hundreds of SNPs; see Wang et al., Science, 280:1077 (1998); see also Schafer et al., Nature Biotechnology 16:33-39 (1998). The compositions of the present invention may easily be substituted for the arrays of the prior art.

There are a variety of particular techniques that are used to detect sequence, including mutations and SNPs. These include, but are not limited to, ligation based assays, cleavage based assays (mismatch and invasive cleavage such as Invader™), single base extension methods (see WO 92/15712, EP 0 371 437 B1, EP 0 317 074 B1; Pastinen et al., Genome Res. 7:606-614 (1997); Syvanen, Clinica Chimica Acta 226:225-236 (1994); and WO 91/13075), and competitive probe analysis (e.g. competitive sequencing by hybridization; see below).

In addition, DNA sequencing is a crucial technology in biology today, as the rapid sequencing of genomes, including the human genome, is both a significant goal and a significant hurdle. Thus there is a significant need for robust, high-throughput methods. Traditionally, the most common method of DNA sequencing has been based on polyacrylamide gel fractionation to resolve a population of chain-terminated fragments (Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463 (1977); Maxam & Gilbert). The population of fragments, terminated at each position in the DNA sequence, can be generated in a number of ways. Typically, DNA polymerase is used to incorporate dideoxynucleotides that serve as chain terminators.

Several alternative methods have been developed to increase the speed and ease of DNA sequencing. For example, sequencing by hybridization has been described (Drmanac et al., Genomics 4:114 (1989); Koster et al., Nature Biotechnology 14:1123 (1996); U.S. Pat. Nos. 5,525,464; 5,202,231 and 5,695,940, among others). Similarly, sequencing by synthesis is an alternative to gel-based sequencing. These methods add and read only one base (or at most a few bases, typically of the same type) prior to polymerization of the next base. This can be referred to as “time resolved” sequencing, in contrast to “gel-resolved” sequencing. Sequencing by synthesis has been described in U.S. Pat. No. 4,971,903 and Hyman, Anal. Biochem. 174:423 (1968); Rosenthal, International Patent Application Publication 761107 (1989); Metzker et al., Nucl. Acids Res. 22:4259 (1994); Jones, Biotechniques 22:938 (1997); Ronaghi et al., Anal. Biochem. 242:84 (1996), Nyren et al., Anal. Biochem. 151:504 (1985). Detection of ATP sulfurylase activity is described in Karamohamed and Nyren, Anal. Biochem. 271:81 (1999). Sequencing using reversible chain terminating nucleotides is described in U.S. Pat. Nos. 5,902,723 and 5,547,839, and Canard and Arzumanov, Gene 11:1 (1994), and Dyatkina and Arzumanov, Nucleic Acids Symp Ser 18:117 (1987). Reversible chain termination with DNA ligase is described in U.S. Pat. No. 5,403,708. Time resolved sequencing is described in Johnson et al., Anal. Biochem. 136:192 (1984). Single molecule analysis is described in U.S. Pat. No. 5,795,782 and Elgen and Rigler, Proc. Natl Acad Sci USA 91(13):5740 (1994), all of which are hereby expressly incorporated by reference in their entirety.

One promising sequencing by synthesis method is based on the detection of the pyrophosphate (PPi) released during the DNA polymerase reaction. As nucleotide triphosphates are added to a growing nucleic acid chain, they release PPi. This release can be quantitatively measured by the conversion of PPi to ATP by the enzyme sulfurylase, and the subsequent production of visible light by firefly luciferase.

Several assay systems have been described that capitalize on this mechanism. See for example WO 93/23564, WO 98/28440 and WO 98/13523, all of which are expressly incorporated by reference. A preferred method is described in Ronaghi et al., Science 281:363 (1998). In this method, the four deoxynucleotides (dATP, dGTP, dCTP and dTTP; collectively dNTPs) are added stepwise to a partial duplex comprising a sequencing primer hybridized to a single stranded DNA template and incubated with DNA polymerase, ATP sulfurylase, luciferase, and optionally a nucleotide-degrading enzyme such as apyrase. A dNTP is only incorporated into the growing DNA strand if it is complementary to the base in the template strand. The synthesis of DNA is accompanied by the release of PPi equal in molarity to the incorporated dNTP. The PPi is converted to ATP and the light generated by the luciferase is directly proportional to the amount of ATP. In some cases the unincorporated dNTPs and the produced ATP are degraded between each cycle by the nucleotide degrading enzyme.
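By way of illustration only, the stepwise dNTP addition described above can be sketched as a small simulation. The function names (`pyrosequence`, `call_sequence`), the dispensation order, and the idealized assumption that the light signal equals the number of nucleotides incorporated in a step are hypothetical simplifications and are not taken from the cited references.

```python
# Idealized sketch of pyrophosphate-based sequencing by synthesis.
# Assumption: one unit of signal (PPi -> ATP -> light) per incorporated dNTP.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pyrosequence(template, dispensation_order="ACGT", cycles=1):
    """Simulate stepwise dNTP addition.

    `template` is the single-stranded region next to the primer, written in
    the order in which the polymerase reads it. Returns a list of
    (nucleotide, signal) pairs, one per dispensation.
    """
    position = 0
    flowgram = []
    for _ in range(cycles):
        for nucleotide in dispensation_order:
            incorporated = 0
            # A dNTP is incorporated only if it is complementary to the next
            # template base; runs of the same base incorporate together.
            while (position < len(template)
                   and COMPLEMENT[template[position]] == nucleotide):
                incorporated += 1
                position += 1
            flowgram.append((nucleotide, incorporated))
    return flowgram


def call_sequence(flowgram):
    """Rebuild the synthesized strand from the per-dispensation signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flowgram)


if __name__ == "__main__":
    flow = pyrosequence("TTGCA")
    print(flow)
    print(call_sequence(flow))
```

In this sketch a dispensation that gives no signal indicates that the dispensed nucleotide was not complementary to the next template base, and a signal of two indicates a run of two identical bases.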

In some cases the DNA template is associated with a solid support. To this end, there are a wide variety of known methods of attaching DNAs to solid supports. Recent work has focused on the attachment of binding ligands, including nucleic acid probes, to microspheres that are randomly distributed on a surface, including a fiber optic bundle, to form high density arrays. See for example PCTs US98/21193, PCT US99/14387 and PCT US98/05025; WO98/50782; and U.S. Ser. Nos. 09/287,573, 09/151,877, 09/256,943, 09/316,154, 60/119,323, 09/315,584; all of which are expressly incorporated by reference.

An additional technique utilizes sequencing by hybridization. For example, sequencing by hybridization has been described (Drmanac et al., Genomics 4:114 (1989); U.S. Pat. Nos. 5,525,464; 5,202,231 and 5,695,940, among others, all of which are hereby expressly incorporated by reference in their entirety).

In addition, sequencing using mass spectrometry techniques has been described; see Koster et al., Nature Biotechnology 14:1123 (1996).

Finally, the use of adapter-type sequences that allow the use of universal arrays has been described in limited contexts; see for example Chee et al., Nucl. Acid Res. 19:3301 (1991); Shoemaker et al., Nature Genetics 14:450 (1998); Barany, F. (1991) Proc. Natl. Acad. Sci. USA 88:189-193; EP 0 799 897 A1; WO 97/31256, all of which are expressly incorporated by reference.

PCTs US98/21193, PCT US99/14387 and PCT US98/05025; WO98/50782; and U.S. Ser. Nos. 09/287,573, 09/151,877, 09/256,943, 09/316,154, 60/119,323, 09/315,584; all of which are expressly incorporated by reference, describe novel compositions utilizing substrates with microsphere arrays, which allow for novel detection methods of nucleic acid hybridization.

Accordingly, it is an object of the present invention to provide detection and quantification methods for a variety of nucleic acid reactions, including genotyping, amplification reactions and sequencing reactions, utilizing microsphere arrays.

SUMMARY OF THE INVENTION

In accordance with the above objects, the present invention provides methods of determining the identity of a nucleotide at a detection position in a target sequence. The methods comprise providing a hybridization complex comprising the target sequence and a capture probe covalently attached to a microsphere on a surface of a substrate. The methods comprise determining the nucleotide at the detection position. The hybridization complex can comprise the capture probe, a capture extender probe, and the target sequence. In addition, the target sequence may comprise exogenous adapter sequences.

In an additional aspect, the method comprises contacting the microspheres with a plurality of detection probes each comprising a unique nucleotide at the readout position and a unique detectable label. The signal from at least one of the detectable labels is detected to identify the nucleotide at the detection position.

In an additional aspect, the detection probe does not contain a detection label, but rather is identified based on its characteristic mass, for example via mass spectrometry. In addition, the detection probe comprises a unique label that is detected based on its characteristic mass.

In a further aspect, the invention provides methods wherein the target sequence comprises a first target domain directly 5′ adjacent to the detection position. The hybridization complex comprises the target sequence, a capture probe and an extension primer hybridized to the first target domain of the target sequence. The determination step comprises contacting the microspheres with a polymerase enzyme, and a plurality of NTPs each comprising a covalently attached detectable label, under conditions whereby if one of the NTPs basepairs with the base at the detection position, the extension primer is extended by the enzyme to incorporate the label. As is known to those in the art, dNTPs and ddNTPs are the preferred substrates for DNA polymerases. NTPs are the preferred substrates for RNA polymerases. The base at the detection position is then identified.
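For illustration only, the single base extension logic described above can be sketched as follows. The sketch makes its own orientation assumption, namely that sequences are written 5′ to 3′ and that the template base read by the extending primer is the target base immediately 5′ of the primer's annealing site; the label names and the function `single_base_extension` are hypothetical.

```python
# Minimal sketch of single base extension genotyping.
# Assumptions (not from the source): the label colors, and the convention
# that the detection position is the target base immediately 5' of the
# site where the extension primer anneals.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
LABELS = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}  # hypothetical


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def single_base_extension(target, primer):
    """Return (detection base, incorporated nucleotide, label), or None.

    Only the labeled nucleotide that basepairs with the detection position
    is incorporated, so the observed label identifies that base.
    """
    start = target.find(reverse_complement(primer))
    if start <= 0:
        # Primer does not anneal, or there is no template base to read.
        return None
    detection_base = target[start - 1]
    incorporated = COMPLEMENT[detection_base]
    return detection_base, incorporated, LABELS[incorporated]


if __name__ == "__main__":
    print(single_base_extension("GGACTTCAGGTCA", "TGACCTGA"))
```

A target carrying a different base at the detection position would incorporate a different labeled nucleotide, which is the basis for reading out the genotype.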

In an additional aspect, the invention provides methods wherein the target sequence comprises a first target domain directly 5′ adjacent to the detection position, wherein the capture probe serves as an extension primer and is hybridized to the first target domain of the target sequence. The determination step comprises contacting the microspheres with a polymerase enzyme, and a plurality of NTPs each comprising a covalently attached detectable label, under conditions whereby if one of the NTPs basepairs with the base at the detection position, the extension primer is extended by the enzyme to incorporate the label. The base at the detection position is thus identified.

In a further aspect, the invention provides methods wherein the target sequence comprises (5′ to 3′), a first target domain comprising an overlap domain comprising at least a nucleotide in the detection position and a second target domain contiguous with the detection position. The hybridization complex comprises a first probe hybridized to the first target domain, and a second probe hybridized to the second target domain. The second probe comprises a detection sequence that does not hybridize with the target sequence, and a detectable label. If the second probe comprises a base that is perfectly complementary to the detection position a cleavage structure is formed. The method further comprises contacting the hybridization complex with a cleavage enzyme that will cleave the detection sequence from the signaling probe and then forming an assay complex with the detection sequence, a capture probe covalently attached to a microsphere on a surface of a substrate, and at least one label. The base at the detection position is thus identified.

In an additional aspect, the invention provides methods of determining the identification of a nucleotide at a detection position in a target sequence comprising a first target domain comprising the detection position and a second target domain adjacent to the detection position. The method comprises hybridizing a first ligation probe to the first target domain, and hybridizing a second ligation probe to the second target domain. If the second ligation probe comprises a base that is perfectly complementary to the detection position a ligation structure is formed. A ligation enzyme is provided that will ligate the first and the second ligation probes to form a ligated probe. An assay complex is formed with the ligated probe, a capture probe covalently attached to a microsphere on a surface of a substrate, and at least one label. The base at the detection position is thus identified.
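As a rough illustration of the ligation-based discrimination described above, the following sketch treats ligation as possible only when the two probes, joined end to end, are perfectly complementary to a contiguous stretch of the target. This all-or-nothing rule, the function name `ligate_if_matched`, and the example sequences are assumptions made for illustration; a real ligase is mainly sensitive to mismatches near the junction.

```python
# Simplified sketch of allele discrimination by probe ligation.
# Assumption: ligation occurs only if both probes match the target perfectly
# and hybridize at directly adjacent positions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def ligate_if_matched(target, upstream_probe, downstream_probe):
    """Return the ligated probe (5'->3') or None if no ligation structure forms.

    `upstream_probe` ends at the junction with its 3' end and
    `downstream_probe` begins there with its 5' end.
    """
    ligated = upstream_probe + downstream_probe
    if reverse_complement(ligated) in target:
        return ligated
    return None


if __name__ == "__main__":
    target = "ACGTTGCAAGGC"
    print(ligate_if_matched(target, "CTTG", "CAAC"))  # matched probes
    print(ligate_if_matched(target, "CTTA", "CAAC"))  # mismatch at junction
```

Only a ligated probe would then be long enough, or carry both the capture sequence and the label, to form the assay complex on the array.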

In a further aspect, the present invention provides methods of sequencing a plurality of target nucleic acids. The methods comprise providing a plurality of hybridization complexes each comprising a target sequence and a sequencing primer that hybridizes to the first domain of the target sequence, wherein the hybridization complexes are attached to a surface of a substrate. The methods comprise extending each of the primers by the addition of a first nucleotide to the first detection position using an enzyme to form an extended primer. The methods comprise detecting the release of pyrophosphate (PPi) to determine the type of the first nucleotide added onto the primers. In one aspect the hybridization complexes are attached to microspheres distributed on the surface. In an additional aspect the sequencing primers are attached to the surface. The hybridization complexes comprise the target sequence, the sequencing primer and a capture probe covalently attached to the surface. The hybridization complexes also comprise an adapter probe.

In an additional aspect, the method comprises extending the extended primer by the addition of a second nucleotide to the second detection position using an enzyme and detecting the release of pyrophosphate to determine the type of second nucleotide added onto the primers. In an additional aspect, the pyrophosphate is detected by contacting the pyrophosphate with a second enzyme that converts pyrophosphate into ATP, and detecting the ATP using a third enzyme. In one aspect, the second enzyme is sulfurylase and/or the third enzyme is luciferase.

In an additional aspect, the invention provides methods of sequencing a target nucleic acid comprising a first domain and an adjacent second domain, the second domain comprising a plurality of target positions. The method comprises providing a hybridization complex comprising the target sequence and a capture probe covalently attached to microspheres on a surface of a substrate and determining the identity of a plurality of bases at the target positions. The hybridization complex comprises the capture probe, an adapter probe, and the target sequence. In one aspect the sequencing primer is the capture probe.

In an additional aspect of the invention, the determining comprises providing a sequencing primer hybridized to the second domain, extending the primer by the addition of a first nucleotide to the first detection position using a first enzyme to form an extended primer, detecting the release of pyrophosphate to determine the type of the first nucleotide added onto the primer, extending the primer by the addition of a second nucleotide to the second detection position using the enzyme, and detecting the release of pyrophosphate to determine the type of the second nucleotide added onto the primer. In an additional aspect pyrophosphate is detected by contacting the pyrophosphate with a second enzyme that converts pyrophosphate into ATP, and detecting the ATP using a third enzyme. In one aspect the second enzyme is sulfurylase and/or the third enzyme is luciferase.

In an additional aspect of the method for sequencing, the determining comprises providing a sequencing primer hybridized to the second domain, extending the primer by the addition of a first protected nucleotide using a first enzyme to form an extended primer, determining the identification of the first protected nucleotide, removing the protection group, adding a second protected nucleotide using the enzyme, and determining the identification of the second protected nucleotide.

In an additional aspect the invention provides a kit for nucleic acid sequencing comprising a composition comprising a substrate with a surface comprising discrete sites and a population of microspheres distributed on the sites, wherein the microspheres comprise capture probes. The kit also comprises an extension enzyme and dNTPs. The kit also comprises a second enzyme for the conversion of pyrophosphate to ATP and a third enzyme for the detection of ATP. In one aspect the dNTPs are labeled. In addition each dNTP comprises a different label.

In a further aspect, the present invention provides methods of detecting a target nucleic acid sequence comprising attaching a first adapter nucleic acid to a first target nucleic acid sequence to form a modified first target nucleic acid sequence, and contacting the modified first target nucleic acid sequence with an array as outlined herein. The presence of the modified first target nucleic acid sequence is then detected.

In an additional aspect, the methods further comprise attaching a second adapter nucleic acid to a second target nucleic acid sequence to form a modified second target nucleic acid sequence and contacting the modified second target nucleic acid sequence with the array.
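Purely as an illustrative sketch, the use of adapter sequences with a universal array can be modeled as a lookup from adapter to capture probe. The bead identifiers, the example sequences, the placement of the adapter at the 5′ end of the target, and the functions `attach_adapter` and `find_capture_site` are all hypothetical choices made for this example.

```python
# Sketch of adapter-directed capture on a universal array.
# Assumptions (not from the source): adapters are attached at the 5' end of
# the target, and a capture probe binds an adapter only when it is the
# adapter's exact reverse complement.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def attach_adapter(adapter, target):
    """Form a modified target nucleic acid carrying the adapter."""
    return adapter + target


def find_capture_site(modified_target, adapter_length, capture_probes):
    """Return the array site whose capture probe hybridizes to the adapter."""
    adapter = modified_target[:adapter_length]
    for site, probe in capture_probes.items():
        if probe == reverse_complement(adapter):
            return site
    return None


if __name__ == "__main__":
    # The same capture probes can serve any target, because only the
    # adapter, not the target sequence, determines where it is captured.
    capture_probes = {"bead_1": "TTGACC", "bead_2": "CAGGTA"}
    first = attach_adapter("GGTCAA", "ACGTACGT")
    second = attach_adapter("TACCTG", "AAAACCCC")
    print(find_capture_site(first, 6, capture_probes))
    print(find_capture_site(second, 6, capture_probes))
```

Because the capture probes are independent of the target sequences, the same array could in principle be reused for different sets of targets by changing only the adapters.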

In a further aspect, the invention provides methods of detecting a target nucleic acid sequence comprising hybridizing a first primer to a first portion of a target sequence, wherein the first primer further comprises an adapter sequence and hybridizing a second primer to a second portion of the target sequence. The first and second primers are ligated together to form a modified primer, and the adapter sequence of the modified primer is contacted with an array of the invention, to allow detection of the presence of the modified primer.

In an additional embodiment, the present invention provides a method for detecting a first target nucleic acid sequence. In one aspect the method comprises hybridizing at least a first primer nucleic acid to the first target sequence to form a first hybridization complex, contacting the first hybridization complex with a first enzyme to form a modified first primer nucleic acid, disassociating the first hybridization complex, contacting the modified first primer nucleic acid with an array comprising a substrate with a surface comprising discrete sites and a population of microspheres comprising at least a first subpopulation comprising a first capture probe such that the first capture probe and the modified primer form an assay complex, wherein the microspheres are distributed on the surface, and detecting the presence of the modified primer nucleic acid.

In addition the method further comprises hybridizing at least a second primer nucleic acid to a second target sequence that is substantially complementary to the first target sequence to form a second hybridization complex, contacting the second hybridization complex with the first enzyme to form a modified second primer nucleic acid, disassociating the second hybridization complex and forming a second assay complex comprising the modified second primer nucleic acid and a second capture probe on a second subpopulation.

In an additional aspect of the invention the primer forms a circular probe following hybridization with the target nucleic acid to form a first hybridization complex, and the first hybridization complex is contacted with a first enzyme comprising a ligase such that the oligonucleotide ligation assay (OLA) occurs. This is followed by adding the second enzyme, a polymerase, such that the circular probe is amplified in a rolling circle amplification (RCA) assay.

In an additional aspect of the invention, the first enzyme comprises a DNA polymerase and the modification is an extension of the primer such that the polymerase chain reaction (PCR) occurs. In an additional aspect of the invention the first enzyme comprises a ligase and the modification comprises a ligation of the first primer which hybridizes to a first domain of the first target sequence, to a third primer which hybridizes to a second adjacent domain of the first target sequence such that the ligase chain reaction (LCR) occurs.

In an additional aspect of the invention, the first primer comprises a first probe sequence, a first scissile linkage and a second probe sequence, wherein the first enzyme will cleave the scissile linkage resulting in the separation of the first and second probe sequences and the disassociation of the first hybridization complex, leaving the first target sequence intact such that the cycling probe technology (CPT) reaction occurs.

In addition, wherein the first enzyme is a polymerase that extends the first primer and the modified first primer comprises a first newly synthesized strand, the method further comprises the addition of a second enzyme comprising a nicking enzyme that nicks the extended first primer leaving the first target sequence intact, and extending from the nick using the polymerase, and thereby displacing the first newly synthesized strand and generating a second newly synthesized strand such that strand displacement amplification (SDA) occurs.

In addition, wherein the first target sequence is an RNA target sequence, the first primer nucleic acid is a DNA primer comprising an RNA polymerase promoter, the first enzyme is a reverse-transcriptase that extends the first primer to form a first newly synthesized DNA strand, the method further comprises the addition of a second enzyme comprising an RNA degrading enzyme that degrades the first target sequence, the addition of a third primer that hybridizes to the first newly synthesized DNA strand, the addition of a third enzyme comprising a DNA polymerase that extends the third primer to form a second newly synthesized DNA strand, to form a newly synthesized DNA hybrid, the addition of a fourth enzyme comprising an RNA polymerase that recognizes the RNA polymerase promoter and generates at least one newly synthesized RNA strand from the DNA hybrid, such that nucleic acid sequence-based amplification (NASBA) occurs.

In addition, wherein the first primer is an invader primer, the method further comprises hybridizing a signaling primer to the target sequence, the enzyme comprises a structure-specific cleaving enzyme and the modification comprises a cleavage of said signaling primer, such that the invasive cleavage reaction occurs.

An additional aspect of the invention is a method for detecting a target nucleic acid sequence comprising hybridizing a first primer to a first target sequence to form a first hybridization complex, contacting the first hybridization complex with a first enzyme to extend the first primer to form a first newly synthesized strand and form a nucleic acid hybrid that comprises an RNA polymerase promoter, contacting the hybrid with an RNA polymerase that recognizes the RNA polymerase promoter and generates at least one newly synthesized RNA strand, contacting the newly synthesized RNA strand with an array comprising a substrate with a surface comprising discrete sites and a population of microspheres comprising at least a first subpopulation comprising a first capture probe, such that the first capture probe and the modified primer form an assay complex, wherein the microspheres are distributed on the surface, and detecting the presence of the newly synthesized RNA strand.

In addition, when the target nucleic acid sequence is an RNA sequence, and prior to hybridizing a first primer to a first target sequence to form a first hybridization complex, the method comprises hybridizing a second primer comprising an RNA polymerase promoter sequence to the RNA sequence to form a second hybridization complex, contacting the second hybridization complex with a second enzyme to extend the second primer to form a second newly synthesized strand and form a nucleic acid hybrid; and degrading the RNA sequence to leave the second newly synthesized strand as the first target sequence. In one aspect of the invention the degrading is done by the addition of an RNA degrading enzyme. In an additional aspect of the invention the degrading is done by RNA degrading activity of reverse transcriptase.

In addition, when the target nucleic acid sequence is a DNA sequence, and prior to hybridizing a first primer to a first target sequence to form a first hybridization complex, the method comprises hybridizing a second primer comprising an RNA polymerase promoter sequence to the DNA sequence to form a second hybridization complex, contacting the second hybridization complex with a second enzyme to extend the second primer to form a second newly synthesized strand and form a nucleic acid hybrid, and denaturing the nucleic acid hybrid such that the second newly synthesized strand is the first target sequence.

An additional aspect of the invention is a kit for the detection of a first target nucleic acid sequence. The kit comprises at least a first nucleic acid primer substantially complementary to at least a first domain of the target sequence, at least a first enzyme that will modify the first nucleic acid primer, and an array comprising a substrate with a surface comprising discrete sites, and a population of microspheres comprising at least a first and a second subpopulation, wherein each subpopulation comprises a bioactive agent, wherein the microspheres are distributed on the surface.

An additional aspect of the invention is a kit for the detection of a PCR reaction wherein the first enzyme is a thermostable DNA polymerase.

An additional aspect of the invention is a kit for the detection of an LCR reaction wherein the first enzyme is a ligase and the kit comprises a first nucleic acid primer substantially complementary to a first domain of the first target sequence and a third nucleic acid primer substantially complementary to a second adjacent domain of the first target sequence.

An additional aspect of the invention is a kit for the detection of a strand displacement amplification (SDA) reaction wherein the first enzyme is a polymerase and the kit further comprises a nicking enzyme.

An additional aspect of the invention is a kit for the detection of a NASBA reaction wherein the first enzyme is a reverse transcriptase, and the kit comprises a second enzyme comprising an RNA degrading enzyme, a third primer, a third enzyme comprising a DNA polymerase and a fourth enzyme comprising an RNA polymerase.

An additional aspect of the invention is a kit for the detection of an invasive cleavage reaction wherein the first enzyme is a structure-specific cleaving enzyme, and the kit comprises a signaling primer.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A, 1B and 1C depict three different embodiments for attaching a target sequence to an array. The solid support 5 has microsphere 10 with capture probe 20 linked via a linker 15. FIG. 1A depicts direct attachment; the capture probe 20 hybridizes to a first portion of the target sequence 25. FIG. 1B depicts the use of a capture extender probe 30 that has a first portion that hybridizes to the capture probe 20 and a second portion that hybridizes to a first domain of the target sequence 25. FIG. 1C shows the use of an adapter sequence 35, that has been added to the target sequence, for example during an amplification reaction as outlined herein.

FIGS. 2A and 2B depict two preferred embodiments of SBE amplification. FIG. 2A shows extension primer 40 hybridized to the target sequence 25. Upon addition of the extension enzyme and labelled nucleotides, the extension primer is modified to form a labelled primer 41. The reaction can be repeated and then the labelled primer is added to the array as above. FIG. 2B depicts the same reaction but using adapter sequences.

FIGS. 3A and 3B depict two preferred embodiments of OLA amplification. FIG. 3A depicts a first ligation probe 45 and a second ligation probe 50 with a label 55. Upon addition of the ligase, the probes are ligated. The reaction can be repeated and then the ligated primer is added to the array as above. FIG. 3B depicts the same reaction but using adapter sequences.

FIG. 4 depicts a preferred embodiment of the invasive cleavage reaction. In this embodiment, the signaling probe 65 comprises two portions, a detection sequence 67 and a signaling portion 66. The signaling portion can serve as an adapter sequence. In addition, the signaling portion generally comprises the label 55, although as will be appreciated by those in the art, the label may be on the detection sequence as well. In addition, for optional removal of the uncleaved probes, a capture tag 60 may also be used. Upon addition of the enzyme, the structure is cleaved, releasing the signaling portion 66. The reaction can be repeated and then the signaling portion is added to the array as above.

FIGS. 5A and 5B depict two preferred embodiments of CPT amplification. A CPT primer 70 comprising a label 55, a first probe sequence 71 and a second probe sequence 73, separated by a scissile linkage 72, and optionally comprising a capture tag 60, is hybridized to the target sequence 25. Upon addition of the enzyme, the scissile linkage is cleaved. The reaction can be repeated and then the probe sequence comprising the label is added to the array as above. FIG. 5B depicts the same reaction but using adapter sequences.

FIGS. 6A and 6B depict OLA/RCA amplification using a single “padlock probe” 57. The padlock probe is hybridized with a target sequence 25. When the probe 57 is complementary to the target sequence 26, ligation of the probe termini occurs, forming a circular probe 28. When the probe 57 is not complementary to the target sequence 27, ligation does not occur. Addition of polymerase and nucleotides to the circular probe results in amplification of the probe 58. Cleavage of the amplified probe 58 yields fragments 59 that hybridize with an identifier probe 21 immobilized on a microsphere 10.

FIGS. 7A, 7B, 7C, 7D, 7E, and 7F depict an alternative method of OLA/RCA. An immobilized first OLA primer 45 is hybridized with a target sequence 25 and a second OLA primer 50. Following the addition of ligase, the first and second OLA primers are ligated to form a ligated oligonucleotide 56. Following denaturation to remove the target nucleic acid, the immobilized ligated oligonucleotide is distributed on an array. An RCA probe 57 and polymerase are added to the array resulting in amplification of the circular RCA probe 58.

FIGS. 8A, 8B, 8C, 8D and 8E schematically depict the use of readout probes for genotyping. FIG. 8A shows a “sandwich” format. Substrate 5 has a discrete site with a microsphere 10 comprising a capture probe 20 attached via a linker 15. The target sequence 25 has a first domain that hybridizes to the capture probe 20 and a second domain comprising a detection position 30 that hybridizes to a readout probe 40 with readout position 35. As will be appreciated by those in the art, FIG. 8A depicts a single detection position; however, depending on the system, a plurality of different probes can hybridize to different target domains; hence n is an integer of 1 or greater. FIG. 8B depicts the use of a capture probe 20 that also serves as a readout probe. FIG. 8C depicts the use of an adapter probe 100 that binds to both the capture probe 20 and the target sequence 25. As will be appreciated by those in the art, the figure depicts that the capture probe 20 and target sequence 25 bind adjacently and as such may be ligated; however, there may be a “gap” of one or more nucleotides. FIG. 8D depicts a solution based assay. Two readout probes 40, each with a different readout position (35 and 36) and different labels (45 and 46) are added to target sequence 25 with detection position 35, to form a hybridization complex with the match probe. This is added to the array; FIG. 8D depicts the use of a capture probe 20 that directly hybridizes to a first domain of the target sequence, although other attachments may be done. FIG. 8E depicts the direct attachment of the target sequence to the array.

FIGS. 9A, 9B, 9C, 9D, 9E, 9F and 9G depict preferred embodiments for SBE genotyping. FIG. 9A depicts a “sandwich” assay, in which substrate 5 has a discrete site with a microsphere 10 comprising a capture probe 20 attached via a linker 15. The target sequence 25 has a first domain that hybridizes to the capture probe 20 and a second domain comprising a detection position 30 that hybridizes to an extension primer 50. As will be appreciated by those in the art, FIG. 9A depicts a single detection position; however, depending on the system, a plurality of different primers can hybridize to different target domains; hence n is an integer of 1 or greater. In addition, the first domain of the target sequence may be an adapter sequence. FIG. 9B depicts the use of a capture probe 20 that also serves as an extension primer. FIG. 9C depicts the solution reaction. FIG. 9D depicts the use of a capture extender probe 100, that has a first domain that will hybridize to the capture probe 20 and a second domain that will hybridize to a first domain of the target sequence 25. FIG. 9E depicts the addition of a ligation step prior to extension of the extension probe. FIG. 9F depicts the addition of a ligation step after the extension of the extension probe. FIG. 9G depicts the SBE solution reaction followed by hybridization of the product of the reaction to the bead array to capture an adapter sequence.

FIGS. 10A, 10B, 10C, 10D and 10E depict some of the OLA genotyping embodiments of the reaction. FIG. 10A depicts the solution reaction, wherein the target sequence 25 with a detection position 30 hybridizes to the first ligation probe 75 with readout position 35 and second probe 76 with a detectable label 45. As will be appreciated by those in the art, the second ligation probe could also contain the readout position. The addition of a ligase forms a ligated probe 80, that can then be added to the array with a capture probe 20. FIG. 10B depicts an “on bead” assay, wherein the capture probe 20 serves as the first ligation probe. FIG. 10C depicts a sandwich assay, using a capture probe 20 that hybridizes to a first portion of the target sequence 25 (which may be an endogenous sequence or an exogenous adapter sequence) and ligation probes 75 and 76 that hybridize to a second portion of the target sequence comprising the detection position 30. FIG. 10D depicts the use of a capture extender probe 100. FIG. 10E depicts a solution based assay with the use of an adapter sequence 110.

FIGS. 11A, 11B and 11C depict the SPOLA reaction for genotyping. In FIG. 11A, two ligation probes are hybridized to a target sequence. As will be appreciated by those in the art, this system requires that the two ligation probes be attached at different ends, i.e. one at the 5′ end and one at the 3′ end. One of the ligation probes is attached via a cleavable linker. Upon formation of the assay complex and the addition of a ligase, the ligase will efficiently covalently couple the two ligation probes if perfect complementarity at the junction exists. In FIG. 11B, the resulting ligation difference between correctly matched probes and imperfect probes is shown. FIG. 11C shows that subsequent cleavage of the cleavable linker produces a reactive group, in this case an amine, that may be subsequently labeled as outlined herein. Alternatively, cleavage may leave an upstream oligo with a detectable label. If not ligated, this labeled oligo can be washed away.

FIGS. 12A and 12B depict two cleavage reactions for genotyping. FIG. 12A depicts a loss of signal assay, wherein a label 45 is cleaved off due to the discrimination of the cleavage enzyme such as a restriction endonuclease or resolvase type enzyme to allow single base mismatch discrimination. FIG. 12B depicts the use of a quencher 46.

FIGS. 13A, 13B, 13C, 13D, and 13E depict the use of invasive cleavage to determine the identity of the nucleotide at the detection position. FIGS. 13A and 13B depict a loss of signal assay. FIG. 13A depicts the invader probe 55 with readout position 35 hybridized to the target sequence 25 which is attached via a capture probe 20 to the surface. The signal probe 60 with readout position 35, detectable label 45 and detection sequence 65 also binds to the target sequence 25; the two probes form a cleavage structure. If the two readout positions 35 are capable of basepairing to the detection position 30, the addition of a structure-specific cleavage enzyme releases the detection sequence 65 and consequently the label 45, leading to a loss of signal. FIG. 13B is the same, except that the capture probe 20 also serves as the invader probe. FIG. 13C depicts a solution reaction, wherein the signaling probe can comprise a capture tag 70 to facilitate the removal of uncleaved signal probes. The addition of the cleaved signal probe (e.g. the detection sequence 65) with its associated label 45 results in detection. FIG. 13D depicts a solution based assay using a label probe 120. FIG. 13E depicts a preferred embodiment of an invasive cleavage reaction that utilizes a fluorophore-quencher reaction. In FIG. 13E, the 3′ end of the signal probe 60 is attached to the bead 10, and the signal probe comprises a label 45 and a quencher 46. Upon formation of the assay complex and subsequent cleavage, the quencher 46 is removed, leaving the fluorophore 45.

FIGS. 14A, 14B, 14C and 14D depict genotyping assays based on the novel combination of competitive hybridization and extension. FIGS. 14A, 14B and 14C depict solution based assays. After hybridization of the extension probe 50 with a match base at the readout position 35, an extension enzyme and dNTP are added, wherein the dNTP comprises a blocking moiety (to facilitate removal of unextended primers) or a hapten to allow purification of extended primer, i.e. biotin, DNP, fluorescein, etc. FIG. 14B depicts the same reaction with the use of an adapter sequence 90; in this embodiment, the same adapter sequence 90 may be used for each readout probe for an allele. FIG. 14C depicts the use of different adapter sequences 90 for each readout probe; in this embodiment, unreacted primers need not be removed, although they may be. FIG. 14D depicts a solid phase reaction, wherein the dNTP added in the position adjacent to the readout position 35 is labeled.

FIGS. 15A and 15B depict genotyping assays based on the novel combination of invasive cleavage and ligation reactions. FIG. 15A is a solution reaction, with the signaling probe 60 comprising a detection sequence 65 with a detectable label 45. After hybridization with the target sequence 25 and cleavage, the free detection sequence can bind to an array (depicted herein as a bead array, although any nucleic acid array can be used), using a capture probe 20 and a template target sequence 26 for the ligation reaction. In the absence of ligation, the signaling probe is washed away. FIG. 15B depicts a solid phase assay. In this embodiment, the 5′ end of the signaling probe is attached to the array (again, depicted herein as a bead array, although any nucleic acid array can be used), and a blocking moiety is used at the 3′ end. After cleavage, a free 3′ end is generated, that can then be used for ligation using a template target 26. As will be appreciated by those in the art, the orientation of this may be switched, such that the 3′ end of the signaling probe 60 is attached, and a free 5′ end is generated for the ligation reaction.

FIGS. 16A and 16B depict genotyping assays based on the novel combination of invasive cleavage and extension reactions. FIG. 16A depicts an initial solution based assay, using a signaling probe with a blocked 3′ end. After cleavage, the detection sequence can be added to an array and a template target added, followed by extension to add a detectable label. Alternatively, the extension can also happen in solution, using a template target 26, followed by addition of the extended probe to the array. FIG. 16B depicts the solid phase reaction; as above, either the 3′ or the 5′ end can be attached. By using a blocking moiety 47, only the newly cleaved ends may be extended.

FIGS. 17A, 17B and 17C depict three configurations of the combination of ligation and extension (“Genetic Bit” analysis) for genotyping. FIG. 17A depicts a reaction wherein the capture probe 20 and the extension probe serve as two ligation probes, and hybridize adjacently to the target sequence, such that an additional ligation step may be done. A labeled nucleotide is added at the readout position. FIG. 17B depicts a preferred embodiment, wherein the ligation probes (one of which is the capture probe 20) are separated by the detection position 30. The addition of a labeled dNTP, extension enzyme and ligase thus serves to detect the readout position. FIG. 17C depicts the solution phase assay. As will be appreciated by those in the art, an extra level of specificity is added if the capture probe 20 spans the ligated probe 80.

DETAILED DESCRIPTION OF THE INVENTION

The present invention is directed to the detection and quantification of a variety of nucleic acid reactions, particularly using microsphere arrays. In particular, the invention relates to the detection of amplification, genotyping, and sequencing reactions. In addition, the invention can be utilized with adapter sequences to create universal arrays.

Accordingly, the present invention provides compositions and methods for detecting and/or quantifying the products of nucleic acid reactions, such as target nucleic acid sequences, in a sample. As will be appreciated by those in the art, the sample solution may comprise any number of things, including, but not limited to, bodily fluids (including, but not limited to, blood, urine, serum, lymph, saliva, anal and vaginal secretions, perspiration and semen, of virtually any organism, with mammalian samples being preferred and human samples being particularly preferred); environmental samples (including, but not limited to, air, agricultural, water and soil samples); biological warfare agent samples; research samples; purified samples, such as purified genomic DNA, RNA, proteins, etc.; and raw samples (bacteria, virus, genomic DNA, etc.). As will be appreciated by those in the art, virtually any experimental manipulation may have been done on the sample.

The present invention provides compositions and methods for detecting the presence or absence of target nucleic acid sequences in a sample. By “nucleic acid” or “oligonucleotide” or grammatical equivalents herein is meant at least two nucleotides covalently linked together. A nucleic acid of the present invention will generally contain phosphodiester bonds, although in some cases, as outlined below, nucleic acid analogs are included that may have alternate backbones, comprising, for example, phosphoramide (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579 (1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805 (1984); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991); and U.S. Pat. No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321 (1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press), and peptide nucleic acid backbones and linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl. 31:1008 (1992); Nielsen, Nature 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all of which are incorporated by reference). Other analog nucleic acids include those with positive backbones (Dempcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones (U.S. Pat. Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowski et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994); Chapters 2 and 3, ACS Symposium Series 580, “Carbohydrate Modifications in Antisense Research”, Ed. Y. S. Sanghvi and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett. 37:743 (1996)) and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, “Carbohydrate Modifications in Antisense Research”, Ed. Y. S. Sanghvi and P. Dan Cook. Nucleic acids containing one or more carbocyclic sugars are also included within the definition of nucleic acids (see Jenkins et al., Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs are described in Rawls, C & E News Jun. 2, 1997, page 35. All of these references are hereby expressly incorporated by reference. These modifications of the ribose-phosphate backbone may be done to facilitate the addition of labels, or to increase the stability and half-life of such molecules in physiological environments.

As will be appreciated by those in the art, all of these nucleic acid analogs may find use in the present invention. In addition, mixtures of naturally occurring nucleic acids and analogs can be made. Alternatively, mixtures of different nucleic acid analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids. This results in two advantages. First, the PNA backbone exhibits improved hybridization kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus perfectly matched basepairs. DNA and RNA typically exhibit a 2-4° C. drop in Tm for an internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C. This allows for better detection of mismatches. Similarly, due to their non-ionic nature, hybridization of the bases attached to these backbones is relatively insensitive to salt concentration.

The nucleic acids may be single stranded or double stranded, as specified, or contain portions of both double stranded or single stranded sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid, where the nucleic acid contains any combination of deoxyribo- and ribo-nucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc. A preferred embodiment utilizes isocytosine and isoguanine in nucleic acids designed to be complementary to other probes, rather than target sequences, as this reduces non-specific hybridization, as is generally described in U.S. Pat. No. 5,681,702. As used herein, the term “nucleoside” includes nucleotides as well as nucleoside and nucleotide analogs, and modified nucleosides such as amino modified nucleosides. In addition, “nucleoside” includes non-naturally occurring analog structures. Thus for example the individual units of a peptide nucleic acid, each containing a base, are referred to herein as a nucleoside.

The compositions and methods of the invention are directed to the detection of target sequences. The term “target sequence” or “target nucleic acid” or grammatical equivalents herein means a nucleic acid sequence on a single strand of nucleic acid. The target sequence may be a portion of a gene, a regulatory sequence, genomic DNA, cDNA, RNA including mRNA and rRNA, or others. As is outlined herein, the target sequence may be a target sequence from a sample, or a secondary target such as a product of a reaction such as a detection sequence from an invasive cleavage reaction, a ligated probe from an OLA reaction, an extended probe from a PCR or SBE reaction, etc. Thus, for example, a target sequence from a sample is amplified to produce a secondary target that is detected; alternatively, an amplification step is done using a signal probe that is amplified, again producing a secondary target that is detected. The target sequence may be any length, with the understanding that longer sequences are more specific. As will be appreciated by those in the art, the complementary target sequence may take many forms. For example, it may be contained within a larger nucleic acid sequence, i.e. all or part of a gene or mRNA, or a restriction fragment of a plasmid or genomic DNA, among others. As is outlined more fully below, probes are made to hybridize to target sequences to determine the presence, absence or quantity of a target sequence in a sample. Generally speaking, this term will be understood by those skilled in the art. The target sequence may also be comprised of different target domains; for example, in “sandwich” type assays as outlined below, a first target domain of the sample target sequence may hybridize to a capture probe or a portion of capture extender probe, a second target domain may hybridize to a portion of an amplifier probe, a label probe, or a different capture or capture extender probe, etc. In addition, the target domains may be adjacent (i.e. contiguous) or separated. For example, when OLA techniques are used, a first primer may hybridize to a first target domain and a second primer may hybridize to a second target domain; either the domains are adjacent, or they may be separated by one or more nucleotides, coupled with the use of a polymerase and dNTPs, as is more fully outlined below. The terms “first” and “second” are not meant to confer an orientation of the sequences with respect to the 5′-3′ orientation of the target sequence. For example, assuming a 5′-3′ orientation of the complementary target sequence, the first target domain may be located either 5′ to the second domain, or 3′ to the second domain. In addition, as will be appreciated by those in the art, the probes on the surface of the array (e.g. attached to the microspheres) may be attached in either orientation, either such that they have a free 3′ end or a free 5′ end; in some embodiments, the probes can be attached at one or more internal positions, or at both ends.
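The distinction between adjacent and separated target domains can be illustrated with a minimal sketch. The function names, the half-open coordinate convention and the returned strings are assumptions made for illustration only and do not appear in the specification; the sketch simply reports how many nucleotides lie between two domains and whether a polymerase and dNTPs would be needed before ligation.

```python
# Illustrative sketch only: names and coordinate convention are hypothetical.
# A domain is a (start, end) pair on the target, half-open, in target coordinates.

def domain_gap(first_domain, second_domain):
    """Return the number of target nucleotides lying between two domains.

    The order of the arguments does not matter, mirroring the statement that
    "first" and "second" confer no 5'-3' orientation.
    """
    (start_a, end_a), (start_b, end_b) = sorted([first_domain, second_domain])
    if start_b < end_a:
        raise ValueError("domains overlap")
    return start_b - end_a


def ola_strategy(gap):
    """Describe the step needed before the two hybridized primers can be joined."""
    if gap == 0:
        return "adjacent: ligate directly"
    return "separated: fill gap with polymerase and dNTPs, then ligate"


print(ola_strategy(domain_gap((0, 20), (20, 40))))   # adjacent domains
print(ola_strategy(domain_gap((23, 40), (0, 20))))   # domains separated by 3 nt
```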

If required, the target sequence is prepared using known techniques. For example, the sample may be treated to lyse the cells, using known lysis buffers, sonication, electroporation, etc., with purification and amplification as outlined below occurring as needed, as will be appreciated by those in the art. In addition, the reactions outlined herein may be accomplished in a variety of ways, as will be appreciated by those in the art. Components of the reaction may be added simultaneously, or sequentially, in any order, with preferred embodiments outlined below. In addition, the reaction may include a variety of other reagents which may be included in the assays. These include reagents like salts, buffers, neutral proteins, e.g. albumin, detergents, etc., which may be used to facilitate optimal hybridization and detection, and/or reduce non-specific or background interactions. Also reagents that otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used, depending on the sample preparation methods and purity of the target.

In addition, in most embodiments, double stranded target nucleic acids are denatured to render them single stranded so as to permit hybridization of the primers and other probes of the invention. A preferred embodiment utilizes a thermal step, generally by raising the temperature of the reaction to about 95° C., although pH changes and other techniques may also be used.

As outlined herein, the invention provides a number of different primers and probes. By “primer nucleic acid” herein is meant a probe nucleic acid that will hybridize to some portion, i.e. a domain, of the target sequence. Probes of the present invention are designed to be complementary to a target sequence (either the target sequence of the sample or to other probe sequences, as is described below), such that hybridization of the target sequence and the probes of the present invention occurs. As outlined below, this complementarity need not be perfect; there may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single stranded nucleic acids of the present invention. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence. Thus, by “substantially complementary” herein is meant that the probes are sufficiently complementary to the target sequences to hybridize under normal reaction conditions.
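As a rough illustration of imperfect complementarity, the following sketch counts base pair mismatches between a probe and a target domain. It assumes plain DNA bases and Watson-Crick pairing only; the function names and example sequences are hypothetical. Counting mismatches does not by itself decide whether hybridization occurs, which depends on the reaction conditions.

```python
# Illustrative sketch only: DNA-only alphabet and all names are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(probe, target_domain):
    """Count positions at which the probe cannot base-pair with the target domain.

    Both sequences are written 5'->3' and must have equal length.
    """
    if len(probe) != len(target_domain):
        raise ValueError("probe and target domain must have the same length")
    perfect_probe = reverse_complement(target_domain)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)


target_domain = "ATGCCGTTA"
perfect_probe = reverse_complement(target_domain)   # perfectly complementary probe
one_off_probe = "TAACGACAT"                         # differs at a single position

print(count_mismatches(perfect_probe, target_domain))
print(count_mismatches(one_off_probe, target_domain))
```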

A variety of hybridization conditions may be used in the present invention, including high, moderate and low stringency conditions; see for example Maniatis et al., Molecular Cloning: A Laboratory Manual, 2d Edition, 1989, and Short Protocols in Molecular Biology, ed. Ausubel, et al., hereby incorporated by reference. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, “Overview of principles of hybridization and the strategy of nucleic acid assays” (1993). Generally, stringent conditions are selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g. 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g. greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of helix destabilizing agents such as formamide. The hybridization conditions may also vary when a non-ionic backbone, i.e. PNA is used, as is known in the art. In addition, cross-linking agents may be added after target binding to cross-link, i.e. covalently attach, the two strands of the hybridization complex.
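The numerical criteria given above for stringent conditions can be collected into a short sketch. It treats the "about" values in the text as hard cut-offs and treats any probe of 50 nucleotides or fewer as a short probe; both are simplifying assumptions, and the function names are hypothetical. Effects of formamide or of a non-ionic backbone such as PNA are not modelled.

```python
# Illustrative sketch only: hard thresholds and names are assumptions.
from typing import Tuple


def stringent_temperature_window(tm_c: float) -> Tuple[float, float]:
    """Return the range about 5-10 degrees C below the melting point Tm."""
    return (tm_c - 10.0, tm_c - 5.0)


def meets_stringent_criteria(sodium_molar: float, ph: float,
                             temperature_c: float, probe_length_nt: int) -> bool:
    """Check the salt, pH and minimum-temperature criteria stated in the text."""
    salt_ok = sodium_molar < 1.0            # less than about 1.0 M sodium ion
    ph_ok = 7.0 <= ph <= 8.3                # pH 7.0 to 8.3
    # At least about 30 C for short probes, at least about 60 C for long probes.
    min_temperature_c = 30.0 if probe_length_nt <= 50 else 60.0
    return salt_ok and ph_ok and temperature_c >= min_temperature_c


print(stringent_temperature_window(65.0))
print(meets_stringent_criteria(0.3, 7.5, 42.0, 25))   # short probe
print(meets_stringent_criteria(0.3, 7.5, 42.0, 80))   # long probe, too cool
```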

Thus, the assays are generally run under stringency conditions which allow formation of the hybridization complex only in the presence of target. Stringency can be controlled by altering a step parameter that is a thermodynamic variable, including, but not limited to, temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH, organic solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is generally outlined in U.S. Pat. No. 5,681,697. Thus it may be desirable to perform certain steps at higher stringency conditions to reduce non-specific binding.

The size of the primer nucleic acid may vary, as will be appreciated by those in the art, in general varying from 5 to 500 nucleotides in length, with primers of between 10 and 100 being preferred, between 15 and 50 being particularly preferred, and from 10 to 35 being especially preferred, depending on the use and amplification technique.
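Because the stated length ranges overlap rather than nest, it can help to check a candidate primer against all of them at once. The following is a minimal sketch; the table and function names are illustrative assumptions, and only the numeric ranges come from the text.

```python
# (tier name, minimum length, maximum length), all lengths in nucleotides
PRIMER_LENGTH_TIERS = [
    ("general", 5, 500),
    ("preferred", 10, 100),
    ("particularly preferred", 15, 50),
    ("especially preferred", 10, 35),
]


def length_tiers(primer_length):
    """Return the name of every tier whose range includes primer_length."""
    return [
        name
        for name, shortest, longest in PRIMER_LENGTH_TIERS
        if shortest <= primer_length <= longest
    ]
```

A 20-nucleotide primer satisfies all four ranges, whereas a 12-nucleotide primer falls in the "especially preferred" range but outside the "particularly preferred" one.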

In addition, the different amplification techniques may have further requirements of the primers, as is more fully described below.

In addition, as outlined herein, a variety of labeling techniques can be used.

Labeling Techniques

In general, either direct or indirect detection of the target products can be done. “Direct” detection as used in this context, as for the other reactions outlined herein, requires the incorporation of a label, in this case a detectable label, preferably an optical label such as a fluorophore, into the target sequence, with detection proceeding as outlined below. In this embodiment, the label(s) may be incorporated in a variety of ways: (1) the primers comprise the label(s), for example attached to the base, a ribose, a phosphate, or to analogous structures in a nucleic acid analog; (2) modified nucleosides are used that are modified at either the base or the ribose (or to analogous structures in a nucleic acid analog) with the label(s); these label-modified nucleosides are then converted to the triphosphate form and are incorporated into a newly synthesized strand by a polymerase; (3) modified nucleotides are used that comprise a functional group that can be used to add a detectable label; (4) modified primers are used that comprise a functional group that can be used to add a detectable label; or (5) a label probe that is directly labeled and hybridizes to a portion of the target sequence can be used. Any of these methods results in a newly synthesized strand or reaction product that comprises labels that can be directly detected as outlined below.

Thus, the modified strands comprise a detection label. By “detection label” or “detectable label” herein is meant a moiety that allows detection. Detection labels may be primary labels (i.e. directly detectable) or secondary labels (indirectly detectable).

In a preferred embodiment, the detection label is a primary label. A primary label is one that can be directly detected, such as a fluorophore. In general, labels fall into three classes: a) isotopic labels, which may be radioactive or heavy isotopes; b) magnetic, electrical, thermal labels; and c) colored or luminescent dyes. Labels can also include enzymes (horseradish peroxidase, etc.) and magnetic particles. Preferred labels include chromophores or phosphors but are preferably fluorescent dyes. Suitable dyes for use in the invention include, but are not limited to, fluorescent lanthanide complexes, including those of Europium and Terbium, fluorescein, rhodamine, tetramethylrhodamine, eosin, erythrosin, coumarin, methyl-coumarins, quantum dots (also referred to as “nanocrystals”; see U.S. Ser. No. 09/315,584, hereby incorporated by reference), pyrene, Malachite green, stilbene, Lucifer Yellow, Cascade Blue™, Texas Red, Cy dyes (Cy3, Cy5, etc.), Alexa dyes, phycoerythrin, BODIPY, and others described in the 6th Edition of the Molecular Probes Handbook by Richard P. Haugland, hereby expressly incorporated by reference.

In a preferred embodiment, a secondary detectable label is used. A secondary label is one that is indirectly detected; for example, a secondary label can bind or react with a primary label for detection, can act on an additional product to generate a primary label (e.g. enzymes), or may allow the separation of the compound comprising the secondary label from unlabeled materials, etc. Secondary labels find particular use in systems requiring separation of labeled and unlabeled probes, such as SBE, OLA, invasive cleavage reactions, etc.; in addition, these techniques may be used with many of the other techniques described herein. Secondary labels include, but are not limited to, one of a binding partner pair; chemically modifiable moieties; nuclease inhibitors; enzymes such as horseradish peroxidase, alkaline phosphatases, luciferases, etc.

In a preferred embodiment, the secondary label is a binding partner pair. For example, the label may be a hapten or antigen, which will bind its binding partner. In a preferred embodiment, the binding partner can be attached to a solid support to allow separation of extended and non-extended primers. For example, suitable binding partner pairs include, but are not limited to: antigens (such as proteins (including peptides)) and antibodies (including fragments thereof (FAbs, etc.)); proteins and small molecules, including biotin/streptavidin; enzymes and substrates or inhibitors; other protein-protein interacting pairs; receptor-ligands; and carbohydrates and their binding partners. Nucleic acid-nucleic acid binding protein pairs are also useful. In general, the smaller of the pair is attached to the NTP for incorporation into the primer. Preferred binding partner pairs include, but are not limited to, biotin (or imino-biotin) and streptavidin, digoxigenin and antibodies (Abs), and Prolinx™ reagents (see prolinxinc.com/ie4/home.hmtl).

In a preferred embodiment, the binding partner pair comprises biotin or imino-biotin and streptavidin. Imino-biotin is particularly preferred as imino-biotin disassociates from streptavidin in pH 4.0 buffer while biotin requires harsh denaturants (e.g. 6 M guanidinium HCl, pH 1.5 or 90% formamide at 95° C.).

In a preferred embodiment, the binding partner pair comprises a primary detection label (for example, attached to the NTP and therefore to the extended primer) and an antibody that will specifically bind to the primary detection label. By “specifically bind” herein is meant that the partners bind with specificity sufficient to differentiate between the pair and other components or contaminants of the system. The binding should be sufficient to remain bound under the conditions of the assay, including wash steps to remove non-specific binding. In some embodiments, the dissociation constants of the pair will be less than about 10⁻⁴-10⁻⁶ M⁻¹, with less than about 10⁻⁵-10⁻⁹ M⁻¹ being preferred and less than about 10⁻⁷-10⁻⁹ M⁻¹ being particularly preferred.

In a preferred embodiment, the secondary label is a chemically modifiable moiety. In this embodiment, labels comprising reactive functional groups are incorporated into the nucleic acid. The functional group can then be subsequently labeled with a primary label. Suitable functional groups include, but are not limited to, amino groups, carboxy groups, maleimide groups, oxo groups and thiol groups, with amino groups and thiol groups being particularly preferred. For example, primary labels containing amino groups can be attached to secondary labels comprising amino groups, for example using linkers as are known in the art; for example, homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company catalog, technical section on cross-linkers, pages 155-200, incorporated herein by reference).

For removal of unextended primers, it is preferred that the other half of the binding pair is attached to a solid support. In this embodiment, the solid support may be any as described herein for substrates and beads, and the form is preferably beads as well; for example, a preferred embodiment utilizes magnetic beads that can be easily introduced to the sample and easily removed, although any affinity chromatography formats may be used as well. Standard methods are used to attach the binding partner to the solid support, and can include direct or indirect attachment methods. For example, biotin labeled antibodies to fluorophores can be attached to streptavidin coated magnetic beads.

Thus, in this embodiment, the extended primers comprise a binding partner that is contacted with its binding partner under conditions wherein the extended or reacted primers are separated from the unextended or unreacted primers. These modified primers can then be added to the array comprising capture probes as described herein.

Removal of Unextended Primers

In a preferred embodiment, it is desirable to remove the unextended or unreacted primers from the assay mixture, and particularly from the array, as unextended primers will compete with the extended (labeled) primers in binding to capture probes, thereby diminishing the signal. The concentration of the unextended primers relative to the extended primer may be relatively high, since a large excess of primer is usually required to generate efficient primer annealing. Accordingly, a number of different techniques may be used to facilitate the removal of unextended primers. While the discussion below applies specifically to SBE, these techniques may be used in any of the methods described herein.

In a preferred embodiment, the NTPs (or, in the case of other methods, one or more of the probes) comprise a secondary detectable label that can be used to separate extended and non-extended primers. As outlined above, detection labels may be primary labels (i.e. directly detectable) or secondary labels (indirectly detectable). A secondary label is one that is indirectly detected; for example, a secondary label can bind or react with a primary label for detection, or may allow the separation of the compound comprising the secondary label from unlabeled materials, etc. Secondary labels find particular use in systems requiring separation of labeled and unlabeled probes, such as SBE, OLA, invasive cleavage reactions, etc.; in addition, these techniques may be used with many of the other techniques described herein. Secondary labels include, but are not limited to, one of a binding partner pair; chemically modifiable moieties; nuclease inhibitors, etc.

In a preferred embodiment, the secondary label is a binding partner pair as outlined above. In a preferred embodiment, the binding partner pair comprises biotin or imino-biotin and streptavidin. Imino-biotin is particularly preferred when the methods require the later separation of the pair, as imino-biotin disassociates from streptavidin in pH 4.0 buffer while biotin requires harsh denaturants (e.g. 6 M guanidinium HCl, pH 1.5 or 90% formamide at 95° C.).

In addition, streptavidin/biotin systems can be used to separate unreacted and reacted probes (for example in SBE, invasive cleavage, etc.). For example, the addition of streptavidin to a nucleic acid greatly increases its size, as well as changes its physical properties, to allow more efficient separation techniques. For example, the mixtures can be size fractionated by exclusion chromatography, affinity chromatography, filtration or differential precipitation. Alternatively, a 3′ exonuclease may be added to a mixture of 3′ labeled biotin/streptavidin; only the unreacted oligonucleotides will be degraded. Following exonuclease treatment, the exonuclease and the streptavidin can be degraded using a protease such as proteinase K. The surviving nucleic acids (i.e. those that were biotinylated) are then hybridized to the array.

In a preferred embodiment, the binding partner pair comprises a primary detection label (attached to the NTP and therefore to the extended primer) and an antibody that will specifically bind to the primary detection label.

In this embodiment, it is preferred that the other half of the binding pair is attached to a solid support. In this embodiment, the solid support may be any as described herein for substrates and beads, and the form is preferably beads as well; for example, a preferred embodiment utilizes magnetic beads that can be easily introduced to the sample and easily removed, although any affinity chromatography formats may be used as well. Standard methods are used to attach the binding partner to the solid support, and can include direct or indirect attachment methods. For example, biotin labeled antibodies to fluorophores can be attached to streptavidin coated magnetic beads.

Thus, in this embodiment, the extended primers comprise a binding member that is contacted with its binding partner under conditions wherein the extended primers are separated from the unextended primers. These extended primers can then be added to the array comprising capture probes as described herein.

In a preferred embodiment, the secondary label is a chemically modifiable moiety. In this embodiment, labels comprising reactive functional groups are incorporated into the nucleic acid.

In a preferred embodiment, the secondary label is a nuclease inhibitor. In this embodiment, the chain-terminating NTPs are chosen to render extended primers resistant to nucleases, such as 3′-exonucleases. Addition of an exonuclease will digest the non-extended primers leaving only the extended primers to bind to the capture probes on the array. This may also be done with OLA, wherein the ligated probe will be protected but the unprotected ligation probe will be digested.
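The selection logic of the nuclease inhibitor approach can be sketched as a simple filter. This is an illustrative model only, not a simulation of enzyme kinetics: the `Primer` record, its field names and the all-or-nothing digestion are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    sequence: str
    extended: bool            # True if a chain-terminating NTP was added
    nuclease_resistant: bool  # True if that NTP blocks 3'-exonuclease attack


def exonuclease_digest(primers):
    """Return the primers that survive 3'-exonuclease treatment.

    Assumption: any primer lacking a protective terminating nucleotide
    is digested completely; protected, extended primers are untouched.
    """
    return [p for p in primers if p.extended and p.nuclease_resistant]


mixture = [
    Primer("ACGTACGTAC", extended=False, nuclease_resistant=False),
    Primer("ACGTACGTACG", extended=True, nuclease_resistant=True),
]
survivors = exonuclease_digest(mixture)
```

Under these assumptions only the extended, protected primer remains available to bind the capture probes on the array.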

In this embodiment, suitable 3′-exonucleases include, but are not limited to, exo I, exo III, exo VII, etc.

The present invention provides a variety of amplification reactions that can be detected using the arrays of the invention.

Amplification Reactions

In this embodiment, the invention provides compositions and methods for the detection (and optionally quantification) of products of nucleic acid amplification reactions, using bead arrays for detection of the amplification products. Suitable amplification methods include both target amplification and signal amplification and include, but are not limited to, polymerase chain reaction (PCR), ligation chain reaction (LCR; sometimes referred to as oligonucleotide ligase amplification, OLA), cycling probe technology (CPT), strand displacement amplification (SDA), transcription mediated amplification (TMA), nucleic acid sequence based amplification (NASBA), rolling circle amplification (RCA), and invasive cleavage technology. All of these methods require a primer nucleic acid (including nucleic acid analogs) that is hybridized to a target sequence to form a hybridization complex, and an enzyme is added that in some way modifies the primer to form a modified primer. For example, PCR generally requires two primers, dNTPs and a DNA polymerase; LCR requires two primers that adjacently hybridize to the target sequence and a ligase; CPT requires one cleavable primer and a cleaving enzyme; invasive cleavage requires two primers and a cleavage enzyme; etc. Thus, in general, a target nucleic acid is added to a reaction mixture that comprises the necessary amplification components, and a modified primer is formed.

In general, the modified primer comprises a detectable label, such as a fluorescent label, which is either incorporated by the enzyme or present on the original primer. As required, the unreacted primers are removed, in a variety of ways, as will be appreciated by those in the art and outlined herein. The hybridization complex is then disassociated, and the modified primer is detected and optionally quantitated by a microsphere array. In some cases, the newly modified primer serves as a target sequence for a secondary reaction, which then produces a number of amplified strands, which can be detected as outlined herein.

Accordingly, the reaction starts with the addition of a primer nucleic acid to the target sequence which forms a hybridization complex. Once the hybridization complex between the primer and the target sequence has been formed, an enzyme, sometimes termed an “amplification enzyme”, is used to modify the primer. As for all the methods outlined herein, the enzymes may be added at any point during the assay, either prior to, during, or after the addition of the primers. The identity of the enzyme will depend on the amplification technique used, as is more fully outlined below. Similarly, the modification will depend on the amplification technique, as outlined below.

Once the enzyme has modified the primer to form a modified primer, the hybridization complex is disassociated. In one aspect, dissociation is by modification of the assay conditions. In another aspect, the modified primer no longer hybridizes to the target nucleic acid and dissociates. Either one or both of these aspects can be employed in signal and target amplification reactions as described below. Generally, the amplification steps are repeated for a period of time to allow a number of cycles, depending on the number of copies of the original target sequence and the sensitivity of detection, with cycles ranging from 1 to thousands, with from 10 to 100 cycles being preferred and from 20 to 50 cycles being especially preferred.

After a suitable time of amplification, unreacted primers are removed, in a variety of ways, as will be appreciated by those in the art and described below, and the hybridization complex is disassociated. In general, the modified primer comprises a detectable label, such as a fluorescent label, which is either incorporated by the enzyme or present on the original primer, and the modified primer is added to a microsphere array such as is generally described in U.S. Ser. Nos. 09/189,543; 08/944,850; 09/033,462; 09/287,573; 09/151,877; 09/187,289 and 09/256,943; and PCT applications US98/09163; US99/14387; US98/21193; US99/04473 and US98/05025, all of which are hereby incorporated by reference. The microsphere array comprises subpopulations of beads that comprise capture probes that will hybridize to the modified primers. Detection proceeds via detection of the label as an indication of the presence, absence or amount of the target sequence, as is more fully outlined below.

Target Amplification

In a preferred embodiment, the amplification is target amplification. Target amplification involves the amplification (replication) of the target sequence to be detected, such that the number of copies of the target sequence is increased. Suitable target amplification techniques include, but are not limited to, the polymerase chain reaction (PCR), strand displacement amplification (SDA), transcription mediated amplification (TMA) and nucleic acid sequence based amplification (NASBA).

Polymerase Chain Reaction Amplification

In a preferred embodiment, the target amplification technique is PCR. The polymerase chain reaction (PCR) is widely used and described, and involves the use of primer extension combined with thermal cycling to amplify a target sequence; see U.S. Pat. Nos. 4,683,195 and 4,683,202, and PCR Essential Data, J. W. Wiley & Sons, Ed. C. R. Newton, 1995, all of which are incorporated by reference. In addition, there are a number of variations of PCR which also find use in the invention, including “quantitative competitive PCR” or “QC-PCR”, “arbitrarily primed PCR” or “AP-PCR”, “immuno-PCR”, “Alu-PCR”, “PCR single strand conformational polymorphism” or “PCR-SSCP”, “reverse transcriptase PCR” or “RT-PCR”, “biotin capture PCR”, “vectorette PCR”, “panhandle PCR”, “PCR select cDNA subtraction”, and “allele-specific PCR”, among others. In some embodiments, PCR is not preferred.

In general, PCR may be briefly described as follows. A double stranded target nucleic acid is denatured, generally by raising the temperature, and then cooled in the presence of an excess of a PCR primer, which then hybridizes to the first target strand. A DNA polymerase then acts to extend the primer with dNTPs, resulting in the synthesis of a new strand forming a hybridization complex. The sample is then heated again, to disassociate the hybridization complex, and the process is repeated. By using a second PCR primer for the complementary target strand, rapid and exponential amplification occurs. Thus PCR steps are denaturation, annealing and extension. The particulars of PCR are well known, and include the use of a thermostable polymerase such as Taq polymerase and thermal cycling.
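The difference between single-primer and two-primer cycling can be illustrated numerically. The following is a minimal sketch under idealized assumptions that are not stated in the text: every template is copied with a fixed per-cycle efficiency, reagents never run out, and the function and parameter names are purely illustrative.

```python
def linear_new_strands(initial_templates, cycles):
    """Single primer: only the original strand is copied each cycle,
    so the number of new strands grows linearly with cycle count."""
    return initial_templates * cycles


def exponential_copies(initial_copies, cycles, efficiency=1.0):
    """Two primers: products of one cycle are templates for the next.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

With perfect doubling, 10 cycles turn one copy into 1024 copies, whereas a single primer would yield only 10 new strands from the same template over the same number of cycles.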

Accordingly, the PCR reaction requires at least one PCR primer, a polymerase, and a set of dNTPs. As outlined herein, the primers may comprise the label, or one or more of the dNTPs may comprise a label.

In general, as is more fully outlined below, the capture probes on the beads of the array are designed to be substantially complementary to the extended part of the primer; that is, unextended primers will not bind to the capture probes. Alternatively, as further described below, unreacted probes may be removed prior to addition to the array.

Strand Displacement Amplification (SDA)

In a preferred embodiment, the target amplification technique is SDA. Strand displacement amplification (SDA) is generally described in Walker et al., in Molecular Methods for Virus Detection, Academic Press, Inc., 1995, and U.S. Pat. Nos. 5,455,166 and 5,130,238, all of which are hereby expressly incorporated by reference in their entirety.

In general, SDA may be described as follows. A single stranded target nucleic acid, usually a DNA target sequence, is contacted with an SDA primer. An “SDA primer” generally has a length of 25-100 nucleotides, with SDA primers of approximately 35 nucleotides being preferred. An SDA primer is substantially complementary to a region at the 3′ end of the target sequence, and the primer has a sequence at its 5′ end (outside of the region that is complementary to the target) that is a recognition sequence for a restriction endonuclease, sometimes referred to herein as a “nicking enzyme” or a “nicking endonuclease”, as outlined below. The SDA primer then hybridizes to the target sequence. The SDA reaction mixture also contains a polymerase (an “SDA polymerase”, as outlined below) and a mixture of all four deoxynucleoside-triphosphates (also called deoxynucleotides or dNTPs, i.e. dATP, dTTP, dCTP and dGTP), at least one species of which is a substituted or modified dNTP; thus, the SDA primer is modified, i.e. extended, to form a modified primer, sometimes referred to herein as a “newly synthesized strand”. The substituted dNTP is modified such that it will inhibit cleavage in the strand containing the substituted dNTP but will not inhibit cleavage on the other strand. Examples of suitable substituted dNTPs include, but are not limited to, 2′-deoxyadenosine 5′-O-(1-thiotriphosphate), 5-methyldeoxycytidine 5′-triphosphate, 2′-deoxyuridine 5′-triphosphate, and 7-deaza-2′-deoxyguanosine 5′-triphosphate. In addition, the substitution of the dNTP may occur after incorporation into a newly synthesized strand; for example, a methylase may be used to add methyl groups to the synthesized strand. In addition, if all the nucleotides are substituted, the polymerase may have 5′-3′ exonuclease activity. However, if less than all the nucleotides are substituted, the polymerase preferably lacks 5′-3′ exonuclease activity.
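The two-part layout of an SDA primer described above can be sketched as string assembly. This is a simplified illustration, not a primer design tool: the function names are hypothetical, the six-base site GTTGAC is used only as an example of the kind of site HincII recognizes, and the short sequences are chosen for readability rather than to meet the 25-100 nucleotide length given in the text.

```python
# Translation table for Watson-Crick complements of DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def build_sda_primer(target, recognition_site, binding_length):
    """Assemble a primer, written 5'->3', with the recognition site at
    its 5' end followed by a region complementary to the 3' end of the
    target (target is also written 5'->3')."""
    if not 0 < binding_length <= len(target):
        raise ValueError("binding_length must fit within the target")
    three_prime_region = target[-binding_length:]
    return recognition_site.upper() + reverse_complement(three_prime_region)


primer = build_sda_primer("GGGGAAAACCCCTTAG", "GTTGAC", binding_length=6)
```

Here the last six bases of the target, CCTTAG, give the binding region CTAAGG, so the assembled primer reads GTTGACCTAAGG with the recognition site outside the target-complementary portion.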

As will be appreciated by those in the art, the recognition site/endonuclease pair can be any of a wide variety of known combinations. The endonuclease is chosen to cleave a strand either at the recognition site, or either 3′ or 5′ to it, without cleaving the complementary sequence, either because the enzyme only cleaves one strand or because of the incorporation of the substituted nucleotides. Suitable recognition site/endonuclease pairs are well known in the art; suitable endonucleases include, but are not limited to, HincII, HindII, AvaI, Fnu4HI, Tth111I, NciI, BstXI, BamHI, etc. A chart depicting suitable enzymes, their corresponding recognition sites and the modified dNTP to use is found in U.S. Pat. No. 5,455,166, hereby expressly incorporated by reference.

Once the strand is nicked, a polymerase (an “SDA polymerase”) is used to extend the newly nicked strand, 5′-3′, thereby creating another newly synthesized strand. The polymerase chosen should be able to initiate 5′-3′ polymerization at a nick site, should also displace the polymerized strand downstream from the nick, and should lack 5′-3′ exonuclease activity (this may be additionally accomplished by the addition of a blocking agent). Thus, suitable polymerases in SDA include, but are not limited to, the Klenow fragment of DNA polymerase I, SEQUENASE 1.0 and SEQUENASE 2.0 (U.S. Biochemical), T5 DNA polymerase and Phi29 DNA polymerase.

Accordingly, the SDA reaction requires, in no particular order, an SDA primer, an SDA polymerase, a nicking endonuclease, and dNTPs, at least one species of which is modified. Again, as outlined above for PCR, preferred embodiments utilize capture probes complementary to the newly synthesized portion of the primer, rather than the primer region, to allow unextended primers to be removed.

In general, SDA does not require thermocycling. The temperature of the reaction is generally set to be high enough to prevent non-specific hybridization but low enough to allow specific hybridization; this is generally from about 37° C. to about 42° C., depending on the enzymes.

In a preferred embodiment, as for most of the amplification techniques described herein, a second amplification reaction can be done using the complementary target sequence, resulting in a substantial increase in amplification during a set period of time. That is, a second primer nucleic acid is hybridized to a second target sequence, that is substantially complementary to the first target sequence, to form a second hybridization complex. The addition of the enzyme, followed by disassociation of the second hybridization complex, results in the generation of a number of newly synthesized second strands.

Nucleic Acid Sequence Based Amplification (NASBA) and Transcription Mediated Amplification (TMA)

In a preferred embodiment, the target amplification technique is nucleic acid sequence based amplification (NASBA). NASBA is generally described in U.S. Pat. No. 5,409,818; Sooknanan et al., Nucleic Acid Sequence-Based Amplification, Ch. 12 (pp. 261-285) of Molecular Methods for Virus Detection, Academic Press, 1995; and “Profiting from Gene-based Diagnostics”, CTB International Publishing Inc., N.J., 1996, all of which are incorporated by reference. NASBA is very similar to both TMA and QβR. Transcription mediated amplification (TMA) is generally described in U.S. Pat. Nos. 5,399,491, 5,888,779, 5,705,365 and 5,710,029, all of which are incorporated by reference. The main difference between NASBA and TMA is that NASBA utilizes the addition of RNase H to effect RNA degradation, and TMA relies on inherent RNase H activity of the reverse transcriptase.

In general, these techniques may be described as follows. A single stranded target nucleic acid, usually an RNA target sequence (sometimes referred to herein as “the first target sequence” or “the first template”), is contacted with a first primer, generally referred to herein as a “NASBA primer” (although “TMA primer” is also suitable). Starting with a DNA target sequence is described below. These primers generally have a length of 25-100 nucleotides, with NASBA primers of approximately 50-75 nucleotides being preferred. The first primer is preferably a DNA primer that has at its 3′ end a sequence that is substantially complementary to the 3′ end of the first template. The first primer also has an RNA polymerase promoter at its 5′ end (or its complement (antisense), depending on the configuration of the system). The first primer is then hybridized to the first template to form a first hybridization complex. The reaction mixture also includes a reverse transcriptase enzyme (an “NASBA reverse transcriptase”) and a mixture of the four dNTPs, such that the first NASBA primer is modified, i.e. extended, to form a modified first primer, comprising a hybridization complex of RNA (the first template) and DNA (the newly synthesized strand).

By “reverse transcriptase” or “RNA-directed DNA polymerase” herein is meant an enzyme capable of synthesizing DNA from a DNA primer and an RNA template. Suitable RNA-directed DNA polymerases include, but are not limited to, avian myeloblastosis virus reverse transcriptase (“AMV RT”) and the Moloney murine leukemia virus RT. When the amplification reaction is TMA, the reverse transcriptase enzyme further comprises an RNA degrading activity as outlined below.

In addition to the components listed above, the NASBA reaction also includes an RNA degrading enzyme, also sometimes referred to herein as a ribonuclease, that will hydrolyze RNA of an RNA:DNA hybrid without hydrolyzing single- or double-stranded RNA or DNA. Suitable ribonucleases include, but are not limited to, RNase H from E. coli and calf thymus.

The ribonuclease activity degrades the first RNA template in the hybridization complex, resulting in a disassociation of the hybridization complex leaving a first single stranded newly synthesized DNA strand, sometimes referred to herein as “the second template”.

In addition, the NASBA reaction also includes a second NASBA primer, generally comprising DNA (although as for all the probes herein, including primers, nucleic acid analogs may also be used). This second NASBA primer has a sequence at its 3′ end that is substantially complementary to the 3′ end of the second template, and also contains an antisense sequence for a functional promoter and the antisense sequence of a transcription initiation site. Thus, this primer sequence, when used as a template for synthesis of the third DNA template, contains sufficient information to allow specific and efficient binding of an RNA polymerase and initiation of transcription at the desired site. In preferred embodiments, the antisense promoter and transcription initiation site are those of the T7 RNA polymerase, although other RNA polymerase promoters and initiation sites can be used as well, as outlined below.

The second primer hybridizes to the second template, and a DNA polymerase, also termed a “DNA-directed DNA polymerase”, also present in the reaction, synthesizes a third template (a second newly synthesized DNA strand), resulting in a second hybridization complex comprising two newly synthesized DNA strands.

Finally, the inclusion of an RNA polymerase and the required four ribonucleoside triphosphates (ribonucleotides or NTPs) results in the synthesis of an RNA strand (a third newly synthesized strand that is essentially the same as the first template). The RNA polymerase, sometimes referred to herein as a “DNA-directed RNA polymerase”, recognizes the promoter and specifically initiates RNA synthesis at the initiation site. In addition, the RNA polymerase preferably synthesizes several copies of RNA per DNA duplex. Preferred RNA polymerases include, but are not limited to, T7 RNA polymerase, and other bacteriophage RNA polymerases including those of phage T3, phage φII, Salmonella phage sp6, or Pseudomonas phage gh-1.

In some embodiments, TMA and NASBA are used with starting DNA target sequences. In this embodiment, it is necessary to utilize the first primer comprising the RNA polymerase promoter and a DNA polymerase enzyme to generate a double stranded DNA hybrid with the newly synthesized strand comprising the promoter sequence. The hybrid is then denatured and the second primer added.

Accordingly, the NASBA reaction requires, in no particular order, a first NASBA primer, a second NASBA primer comprising an antisense sequence of an RNA polymerase promoter, an RNA polymerase that recognizes the promoter, a reverse transcriptase, a DNA polymerase, an RNA degrading enzyme, NTPs and dNTPs, in addition to the detection components outlined below.

These components result in a single starting RNA template generating a single DNA duplex; however, since this DNA duplex results in the creation of multiple RNA strands, which can then be used to initiate the reaction again, amplification proceeds rapidly.
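The cyclic character of this amplification can be illustrated with a toy model. The sketch below rests on simplifying assumptions that are not taken from the text: the reaction is treated as discrete rounds, every RNA strand is converted to exactly one DNA duplex per round, each duplex is transcribed a fixed number of times, and the RNA templates of the previous round are degraded. The function and parameter names are illustrative.

```python
def nasba_rna_copies(initial_rna, rna_per_duplex, rounds):
    """Estimate the RNA strand count after a number of idealized rounds."""
    if initial_rna < 0 or rna_per_duplex < 0 or rounds < 0:
        raise ValueError("all arguments must be non-negative")
    rna = initial_rna
    for _ in range(rounds):
        duplexes = rna                   # one DNA duplex per RNA template
        rna = duplexes * rna_per_duplex  # several RNA copies per duplex
    return rna
```

Under these assumptions, one starting RNA template with an assumed 10 transcripts per duplex gives 1000 RNA strands after three rounds, which illustrates why the reaction proceeds rapidly even though each template yields only a single duplex.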

Accordingly, the TMA reaction requires, in no particular order, a first TMA primer, a second TMA primer comprising an antisense sequence of an RNA polymerase promoter, an RNA polymerase that recognizes the promoter, a reverse transcriptase with RNA degrading activity, a DNA polymerase, NTPs and dNTPs, in addition to the detection components outlined below.

These components result in a single starting RNA template generating a single DNA duplex; however, since this DNA duplex results in the creation of multiple RNA strands, which can then be used to initiate the reaction again, amplification proceeds rapidly.

As outlined herein, the detection of the newly synthesized strands can proceed in several ways. Direct detection can be done when the newly synthesized strands comprise detectable labels, either by incorporation into the primers or by incorporation of modified labelled nucleotides into the growing strand. Alternatively, as is more fully outlined below, indirect detection of unlabeled strands (which now serve as “targets” in the detection mode) can occur using a variety of sandwich assay configurations. As will be appreciated by those in the art, any of the newly synthesized strands can serve as the “target” to form an assay complex on a surface with a capture probe. In NASBA and TMA, it is preferable to utilize the newly formed RNA strands as the target, as this is where significant amplification occurs.

In this way, a number of secondary target molecules are made. As is more fully outlined below, these reactions (that is, the products of these reactions) can be detected in a number of ways.

Signal Amplification Techniques

In a preferred embodiment, the amplification technique is signal amplification. Signal amplification involves the use of a limited number of target molecules as templates to either generate multiple signaling probes or allow the use of multiple signaling probes. Signal amplification strategies include LCR, CPT, QβR, invasive cleavage technology, and the use of amplification probes in sandwich assays.

Single Base Extension (SBE)

In a preferred embodiment, single base extension (SBE; sometimes referred to as “minisequencing”) is used for amplification. It should also be noted that SBE finds use in genotyping, as is described below. Briefly, SBE is a technique that utilizes an extension primer that hybridizes to the target nucleic acid. A polymerase (generally a DNA polymerase) is used to extend the 3′ end of the primer with a nucleotide analog labeled with a detection label as described herein. Based on the fidelity of the enzyme, a nucleotide is only incorporated into the extension primer if it is complementary to the adjacent base in the target strand. Generally, the nucleotide is derivatized such that no further extensions can occur, so only a single nucleotide is added. However, for amplification reactions, this may not be necessary. Once the labeled nucleotide is added, detection of the label proceeds as outlined herein. See generally Syvanen et al., Genomics 8:684-692 (1990); U.S. Pat. Nos. 5,846,710 and 5,888,819; Pastinen et al., Genome Res. 7(6):606-614 (1997); all of which are expressly incorporated herein by reference.

The reaction is initiated by introducing the assay complex comprising the target sequence (i.e. the array) to a solution comprising a first nucleotide, frequently a nucleotide analog. By “nucleotide analog” in this context herein is meant a deoxynucleoside-triphosphate (also called deoxynucleotides or dNTPs, i.e. dATP, dTTP, dCTP and dGTP), that is further derivatized to be chain terminating. As will be appreciated by those in the art, any number of nucleotide analogs may be used, as long as a polymerase enzyme will still incorporate the nucleotide at the interrogation position. Preferred embodiments utilize dideoxy-triphosphate nucleotides (ddNTPs). Generally, a set of nucleotides comprising ddATP, ddCTP, ddGTP and ddTTP is used, at least one of which includes a label, and preferably all four. For amplification rather than genotyping reactions, the labels may all be the same; alternatively, different labels may be used.

In a preferred embodiment, the nucleotide analogs comprise a detectable label, which can be either a primary or secondary detectable label. Preferred primary labels are those outlined above. However, the enzymatic incorporation of nucleotides comprising fluorophores is poor under many conditions; accordingly, preferred embodiments utilize secondary detectable labels. In addition, as outlined below, the use of secondary labels may also facilitate the removal of unextended probes.

In addition to a first nucleotide, the solution also comprises an extension enzyme, generally a DNA polymerase. Suitable DNA polymerases include, but are not limited to, the Klenow fragment of DNA polymerase I, SEQUENASE 1.0 and SEQUENASE 2.0 (U.S. Biochemical), T5 DNA polymerase and Phi29 DNA polymerase. If the nucleotide is complementary to the base of the detection position of the target sequence, which is adjacent to the extension primer, the extension enzyme will add it to the extension primer. Thus, the extension primer is modified, i.e. extended, to form a modified primer, sometimes referred to herein as a “newly synthesized strand”.

A limitation of this method is that unless the target nucleic acid is in sufficient concentration, the amount of unextended primer in the reaction greatly exceeds the resultant extended-labeled primer. The excess of unextended primer competes with the detection of the labeled primer in the assays described herein. Accordingly, when SBE is used, preferred embodiments utilize methods for the removal of unextended primers as outlined herein.

One method to overcome this limitation is thermocycling minisequencing, in which repeated cycles of annealing, primer extension, and heat denaturation using a thermocycler and a thermostable polymerase allow the amplification of the extension probe, which results in the accumulation of extended primers. For example, if the original unextended primer to target nucleic acid concentration is 100:1 and 100 thermocycles and extensions are performed, a majority of the primer will be extended.
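The 100:1 example can be checked with a small calculation. The following is only a sketch under a simplifying assumption that is not stated in the specification: in each cycle every target molecule anneals to one primer chosen at random, whether or not that primer has already been extended, and extension is fully efficient. The function name and parameters are hypothetical.

```python
def extended_fraction(primer_to_target: float, cycles: int) -> float:
    """Fraction of primers extended after a number of thermocycles.

    Assumed model: per cycle, each target anneals to one randomly chosen
    primer; an unextended primer that anneals is always extended, and an
    already extended primer that anneals is simply unchanged.
    """
    # Chance that a given primer meets a target in one cycle.
    per_cycle = min(1.0, 1.0 / primer_to_target)
    # A primer stays unextended only if it is missed in every cycle.
    return 1.0 - (1.0 - per_cycle) ** cycles


frac = extended_fraction(primer_to_target=100, cycles=100)
print(f"{frac:.3f}")  # prints 0.634
```

Under this assumed model roughly 63% of the primer is extended after 100 cycles, which is consistent with the statement that a majority of the primer will be extended.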

As will be appreciated by those in the art, the configuration of the SBE system can take on several forms. As for the LCR reaction described below, the reaction may be done in solution, and then the newly synthesized strands, with the base-specific detectable labels, can be detected. For example, they can be directly hybridized to capture probes that are complementary to the extension primers, and the presence of the label is then detected.

Alternatively, the SBE reaction can occur on a surface. For example, a target nucleic acid may be captured using a first capture probe that hybridizes to a first target domain of the target, and the reaction can proceed at a second target domain. The extended labeled primers are then bound to a second capture probe and detected.

Thus, the SBE reaction requires, in no particular order, an extension primer, a polymerase and dNTPs, at least one of which is labeled.
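As an illustration of the base-specific extension described above, the following sketch predicts which chain-terminating nucleotide a polymerase would add to an extension primer. It assumes perfect fidelity and exact primer complementarity; the sequences, function names, and string representation are hypothetical and chosen only for the example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbe_call(target: str, primer: str) -> str:
    """Predict the ddNTP added to the 3' end of the extension primer.

    Both sequences are written 5'->3'. The primer anneals antiparallel to
    the target, so the template base read next lies immediately 5' of the
    primer binding site on the target (the detection position).
    """
    site = reverse_complement(primer)
    start = target.find(site)
    if start <= 0:
        raise ValueError("primer site not found, or no detection base 5' of it")
    detection_base = target[start - 1]
    return "dd" + COMPLEMENT[detection_base] + "TP"


# Two hypothetical alleles that differ only at the detection position.
primer = "GCTAGCTT"
allele_1 = sbe_call("TTGACCGTAAGCTAGC", primer)
allele_2 = sbe_call("TTGACCGCAAGCTAGC", primer)
print(allele_1, allele_2)  # prints ddATP ddGTP
```

With a different label on each ddNTP, the label detected on the extended primer identifies the base at the detection position.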

Oligonucleotide Ligation Amplification (OLA)

In a preferred embodiment, the signal amplification technique is OLA. OLA, which is referred to as the ligation chain reaction (LCR) when two-stranded substrates are used, involves the ligation of two smaller probes into a single long probe, using the target sequence as the template. In LCR, the ligated probe product becomes the predominant template as the reaction progresses. The method can be run in two different ways; in a first embodiment, only one strand of a target sequence is used as a template for ligation; alternatively, both strands may be used. See generally U.S. Pat. Nos. 5,185,243, 5,679,524 and 5,573,907; EP 0 320 308 B1; EP 0 336 731 B1; EP 0 439 182 B1; WO 90/01069; WO 89/12696; WO 97/31256; and WO 89/09835, and U.S. Ser. Nos. 60/078,102 and 60/073,011, all of which are incorporated by reference.

In a preferred embodiment, the single-stranded target sequence comprises a first target domain and a second target domain, which are adjacent and contiguous. A first OLA primer nucleic acid and a second OLA primer nucleic acid are added, each substantially complementary to its respective target domain, and thus will hybridize to the target domains. These target domains may be directly adjacent, i.e. contiguous, or separated by a number of nucleotides. If they are non-contiguous, nucleotides are added along with means to join nucleotides, such as a polymerase, that will add the nucleotides to one of the primers. The two OLA primers are then covalently attached, for example using a ligase enzyme such as is known in the art, to form a modified primer. This forms a first hybridization complex comprising the ligated probe and the target sequence. This hybridization complex is then denatured (disassociated), and the process is repeated to generate a pool of ligated probes.

In a preferred embodiment, OLA is done for two strands of a double-stranded target sequence. The target sequence is denatured, and two sets of probes are added: one set as outlined above for one strand of the target, and a separate set (i.e. third and fourth primer probe nucleic acids) for the other strand of the target. In a preferred embodiment, the first and third probes will hybridize, and the second and fourth probes will hybridize, such that amplification can occur. That is, when the first and second probes have been attached, the ligated probe can now be used as a template, in addition to the second target sequence, for the attachment of the third and fourth probes. Similarly, the ligated third and fourth probes will serve as a template for the attachment of the first and second probes, in addition to the first target strand. In this way, an exponential, rather than just a linear, amplification can occur.
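The difference between linear and exponential amplification can be made concrete with an idealized count of ligated probes. This is a sketch only: it assumes every template is ligated upon with the same per-cycle efficiency and that reagents never become limiting, neither of which is claimed in the text. The function and parameter names are hypothetical.

```python
def ligated_products(templates: float, cycles: int,
                     efficiency: float = 1.0,
                     exponential: bool = True) -> float:
    """Idealized number of ligated probes after a number of cycles.

    Linear mode: only the original target strands act as templates, so each
    cycle adds templates * efficiency new ligated probes.
    Exponential mode: each ligated probe also serves as a template for the
    opposite probe pair, so the template pool grows by (1 + efficiency)
    per cycle.
    """
    if exponential:
        return templates * ((1.0 + efficiency) ** cycles - 1.0)
    return templates * efficiency * cycles


# One strand used as template (linear) versus both strands (exponential).
linear = ligated_products(templates=1, cycles=10, exponential=False)
expo = ligated_products(templates=2, cycles=10)
print(linear, expo)  # prints 10.0 2046.0
```

Even in this simplified picture, ten cycles yield about two hundred times more ligated product when both strands and the ligated probes themselves serve as templates.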

As will be appreciated by those in the art, the ligation product can be detected in a variety of ways. In a preferred embodiment, the ligation reaction is run in solution. In this embodiment, only one of the primers carries a detectable label, e.g. the first ligation probe, and the capture probe on the bead is substantially complementary to the other probe, e.g. the second ligation probe. In this way, unextended labeled ligation primers will not interfere with the assay. That is, in a preferred embodiment, the ligation product is detected by solid-phase oligonucleotide probes. The solid-phase probes are preferably complementary to at least a portion of the ligation product. In a preferred embodiment, the solid-phase probe is complementary to the 5′ detection oligonucleotide portion of the ligation product. This substantially reduces or eliminates false signal generated by the optically-labeled 3′ primers. Preferably, detection is accomplished by removing the unligated 5′ detection oligonucleotide from the reaction before application to a capture probe. In one embodiment, the unligated 5′ detection oligonucleotides are removed by digesting 3′ non-protected oligonucleotides with a 3′ exonuclease, such as exonuclease I. The ligation products are protected from exo I digestion by including, for example, 4-phosphorothioate residues at their 3′ terminus, thereby rendering them resistant to exonuclease digestion. The unligated detection oligonucleotides are not protected and are digested.

Alternatively, the target nucleic acid is immobilized on a solid-phase surface. The ligation assay is performed and unligated oligonucleotides are removed by washing under appropriate stringency. The ligated oligonucleotides are eluted from the target nucleic acid using denaturing conditions, such as 0.1 N NaOH, and detected as described herein.

Again, as outlined above, the detection of the LCR reaction can also occur directly, in the case where one or both of the primers comprises at least one detectable label, or indirectly, using sandwich assays, through the use of additional probes; that is, the ligated probes can serve as target sequences, and detection may utilize amplification probes, capture probes, capture extender probes, label probes, and label extender probes, etc.

Rolling-Circle Amplification (RCA)

In a preferred embodiment the signal amplification technique is RCA. Rolling-circle amplification is generally described in Baner et al. (1998) Nuc. Acids Res. 26:5073-5078; Barany, F. (1991) Proc. Natl. Acad. Sci. USA 88:189-193; and Lizardi et al. (1998) Nat. Genet. 19:225-232, all of which are incorporated by reference in their entirety.

In general, RCA may be described in two ways. First, as is outlined in more detail below, a single probe is hybridized with a target nucleic acid. Each terminus of the probe hybridizes adjacently on the target nucleic acid and the OLA assay as described above occurs. When ligated, the probe is circularized while hybridized to the target nucleic acid. Addition of a polymerase results in extension of the circular probe. However, since the probe has no terminus, the polymerase continues to extend the probe repeatedly. This results in amplification of the circular probe.

A second alternative approach involves OLA followed by RCA. In this embodiment, an immobilized primer is contacted with a target nucleic acid. Complementary sequences will hybridize with each other resulting in an immobilized duplex. A second primer is contacted with the target nucleic acid. The second primer hybridizes to the target nucleic acid adjacent to the first primer. An OLA assay is performed as described above. Ligation only occurs if the primers are complementary to the target nucleic acid. When a mismatch occurs, particularly at one of the nucleotides to be ligated, ligation will not occur. Following ligation of the oligonucleotides, the ligated, immobilized oligonucleotide is then hybridized with an RCA probe. This is a circular probe that is designed to specifically hybridize with the ligated oligonucleotide and will only hybridize with an oligonucleotide that has undergone ligation. RCA is then performed as is outlined in more detail below.

Accordingly, in a preferred embodiment, a single oligonucleotide is used both for OLA and as the circular template for RCA (referred to herein as a “padlock probe” or an “RCA probe”). That is, each terminus of the oligonucleotide contains sequence complementary to the target nucleic acid and functions as an OLA primer as described above. That is, the first end of the RCA probe is substantially complementary to a first target domain, and the second end of the RCA probe is substantially complementary to a second target domain, adjacent to the first domain. Hybridization of the oligonucleotide to the target nucleic acid results in the formation of a hybridization complex. Ligation of the “primers” (which are the discrete ends of a single oligonucleotide) results in the formation of a modified hybridization complex containing a circular probe, i.e. an RCA template complex. That is, the oligonucleotide is circularized while still hybridized with the target nucleic acid. This serves as a circular template for RCA. Addition of a polymerase to the RCA template complex results in the formation of an amplified product nucleic acid. Following RCA, the amplified product nucleic acid is detected (FIGS. 6A and 6B). This can be accomplished in a variety of ways; for example, the polymerase may incorporate labelled nucleotides, or alternatively, a label probe is used that is substantially complementary to a portion of the RCA probe and comprises at least one label.
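The geometry of such a probe can be sketched in a few lines. The example below assumes exact complementarity and contiguous target domains; the sequences, the linker, and the function names are hypothetical and serve only to show how the two ends of one oligonucleotide come to abut on the target.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_padlock(first_domain: str, second_domain: str, linker: str) -> str:
    """Build a linear padlock probe (5'->3') for two adjacent target domains.

    first_domain lies immediately 5' of second_domain on the target. The
    probe's 5' arm pairs with first_domain and its 3' arm with
    second_domain, so the two probe ends meet at the domain junction.
    """
    five_prime_arm = reverse_complement(first_domain)
    three_prime_arm = reverse_complement(second_domain)
    return five_prime_arm + linker + three_prime_arm


def ends_abut_on_target(probe: str, arm5_len: int, arm3_len: int,
                        target_region: str) -> bool:
    """Check that the 3' arm followed by the 5' arm pairs with the target."""
    joined_ends = probe[-arm3_len:] + probe[:arm5_len]
    return joined_ends == reverse_complement(target_region)


first, second = "ACGTTG", "CATGGA"        # hypothetical target domains
probe = design_padlock(first, second, linker="TTTTTTTT")
ok = ends_abut_on_target(probe, len(first), len(second), first + second)
print(probe, ok)  # prints CAACGTTTTTTTTTTCCATG True
```

In practice the intervening linker would carry the elements described for the padlock probe, such as an adapter sequence, a restriction site and a priming site.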

The polymerase can be any polymerase, but is preferably one lacking 3′ exonuclease activity (3′ exo⁻). Examples of suitable polymerases include but are not limited to exonuclease minus DNA Polymerase I large (Klenow) Fragment, Phi29 DNA polymerase, Taq DNA Polymerase and the like. In addition, in some embodiments, a polymerase that will replicate single-stranded DNA (i.e. without a primer forming a double stranded section) can be used.

In a preferred embodiment, the RCA probe contains an adapter sequence as outlined herein, with adapter capture probes on the array, for example on a microsphere when microsphere arrays are being used. Alternatively, unique portions of the RCA probes, for example all or part of the sequence corresponding to the target sequence, can be used to bind to a capture probe.

In a preferred embodiment, the padlock probe contains a restriction site. The restriction endonuclease site allows for cleavage of the long concatamers that are typically the result of RCA into smaller individual units that hybridize either more efficiently or faster to surface bound capture probes. Thus, following RCA, the product nucleic acid is contacted with the appropriate restriction endonuclease. This results in cleavage of the product nucleic acid into smaller fragments. The fragments are then hybridized with the capture probe that is immobilized, resulting in a concentration of product fragments onto the microsphere. Again, as outlined herein, these fragments can be detected in one of two ways: either labelled nucleotides are incorporated during the replication step, or an additional label probe is added.
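The effect of the restriction site on a concatameric product can be illustrated with a simple string model. This sketch treats the product as a single written strand and ignores how the site is made cleavable in practice; the repeat unit is hypothetical, and the site used is the EcoRI recognition sequence GAATTC, cut after the first G, chosen purely as an example.

```python
def digest(concatemer: str, site: str, cut_offset: int) -> list:
    """Cut a sequence at every occurrence of a recognition site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for a cut between G and AATTC).
    """
    fragments = []
    start = 0
    pos = concatemer.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(concatemer[start:cut])
        start = cut
        pos = concatemer.find(site, pos + 1)
    fragments.append(concatemer[start:])  # trailing piece after the last cut
    return fragments


unit = "ACGTGAATTCTTGCA"      # hypothetical repeat unit with one site
concatemer = unit * 5         # idealized product of five trips around a circle
fragments = digest(concatemer, site="GAATTC", cut_offset=1)
print(len(fragments), [len(f) for f in fragments])
# prints 6 [5, 15, 15, 15, 15, 10]
```

Each internal fragment is one unit long, so every fragment carries one copy of whatever capture and label sequences lie between consecutive restriction sites.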

Thus, in a preferred embodiment, the padlock probe comprises a label sequence; i.e. a sequence that can be used to bind label probes and is substantially complementary to a label probe. In one embodiment, it is possible to use the same label sequence and label probe for all padlock probes on an array; alternatively, each padlock probe can have a different label sequence.

The padlock probe also contains a priming site for priming the RCA reaction. That is, each padlock probe comprises a sequence to which a primer nucleic acid hybridizes forming a template for the polymerase. The primer can be found in any portion of the circular probe. In a preferred embodiment, the primer is located at a discrete site in the probe. In this embodiment, the primer site in each distinct padlock probe is identical, although this is not required. Advantages of using primer sites with identical sequences include the ability to use only a single primer oligonucleotide to prime the RCA assay with a plurality of different hybridization complexes. That is, the padlock probe hybridizes uniquely to the target nucleic acid to which it is designed. A single primer hybridizes to all of the unique hybridization complexes forming a priming site for the polymerase. RCA then proceeds from an identical locus within each unique padlock probe of the hybridization complexes.

In an alternative embodiment, the primer site can overlap, encompass, or reside within any of the above-described elements of the padlock probe. That is, the primer can be found, for example, overlapping or within the restriction site or the identifier sequence. In this embodiment, it is necessary that the primer nucleic acid is designed to base pair with the chosen primer site.

Thus, the padlock probe of the invention contains, at each terminus, sequences corresponding to OLA primers. The intervening sequence of the padlock probe contains, in no particular order, an adapter sequence and a restriction endonuclease site. In addition, the padlock probe contains an RCA priming site.

Thus, in a preferred embodiment the OLA/RCA is performed in solution followed by restriction endonuclease cleavage of the RCA product. The cleaved product is then applied to an array comprising beads, each bead comprising a probe complementary to the adapter sequence located in the padlock probe. The amplified adapter sequence correlates with a particular target nucleic acid. Thus the incorporation of an endonuclease site allows the generation of short, easily hybridizable sequences. Furthermore, the unique adapter sequence in each rolling circle padlock probe sequence allows diverse sets of nucleic acid sequences to be analyzed in parallel on an array, since each sequence is resolved on the basis of hybridization specificity.
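The correlation between adapter sequence and target can be pictured as a lookup performed after the array is read. The following sketch is illustrative only: the adapter sequences, target names, signal values, and threshold are all hypothetical, and it assumes that each bead's capture probe is the exact reverse complement of one adapter.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def decode_array(bead_probes: dict, bead_signal: dict,
                 adapter_to_target: dict, threshold: float) -> dict:
    """Report, per target, whether its adapter gave signal on the array.

    bead_probes maps a bead identifier to the capture probe on that bead;
    bead_signal maps a bead identifier to its measured label intensity.
    """
    calls = {}
    for bead_id, probe in bead_probes.items():
        adapter = reverse_complement(probe)   # adapter this bead captures
        target = adapter_to_target.get(adapter)
        if target is not None:
            calls[target] = bead_signal.get(bead_id, 0.0) >= threshold
    return calls


adapter_to_target = {"ACGGTCAT": "target_A", "TTGCAGCC": "target_B"}
bead_probes = {
    "bead_1": reverse_complement("ACGGTCAT"),
    "bead_2": reverse_complement("TTGCAGCC"),
}
bead_signal = {"bead_1": 950.0, "bead_2": 40.0}
calls = decode_array(bead_probes, bead_signal, adapter_to_target, threshold=200.0)
print(calls)  # prints {'target_A': True, 'target_B': False}
```

Because the adapters, rather than the target sequences themselves, determine where product hybridizes, the same array of capture probes can be reused for different sets of targets.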

In an alternative OLA/RCA method, one of the OLA primers is immobilized on the microsphere; the second primer is added in solution. Both primers hybridize with the target nucleic acid forming a hybridization complex as described above for the OLA assay.

As described herein, the microsphere is distributed on an array. In a preferred embodiment, a plurality of microspheres, each with a unique OLA primer, is distributed on the array.

Following the OLA assay, and either before, after or concurrently with distribution of the beads on the array, a segment of circular DNA is hybridized to the bead-based ligated oligonucleotide forming a modified hybridization complex. Addition of an appropriate polymerase (3′ exo⁻), as is known in the art, and corresponding reaction buffer to the array leads to amplification of the circular DNA. Since there is no terminus to the circular DNA, the polymerase continues to travel around the circular template generating extension product until it detaches from the template. Thus, a polymerase with high processivity can create several hundred or thousand copies of the circular template with all the copies linked in one contiguous strand.

Again, these copies are subsequently detected by one of two methods; either hybridizing a labeled oligo complementary to the circular target or via the incorporation of labeled nucleotides in the amplification reaction. The label is detected using conventional label detection methods as described herein.

In one embodiment, when the circular DNA contains sequences complementary to the ligated oligonucleotide it is preferable to remove the target DNA prior to contacting the ligated oligonucleotide with the circular DNA (See FIG. 7). This is done by denaturing the double-stranded DNA by methods known in the art. In an alternative embodiment, the double stranded DNA is not denatured prior to contacting the circular DNA.

In an alternative embodiment, when the circular DNA contains sequences complementary to the target nucleic acid, it is preferable that the circular DNA is complementary at a site distinct from the site bound to the ligated oligonucleotide. In this embodiment it is preferred that the duplex between the ligated oligonucleotide and target nucleic acid is not denatured or disrupted prior to the addition of the circular DNA so that the target DNA remains immobilized to the bead.

Hybridization and washing conditions are well known in the art; various degrees of stringency can be used. In some embodiments it is not necessary to use stringent hybridization or washing conditions as only microspheres containing the ligated probes will effectively hybridize with the circular DNA; microspheres bound to DNA that did not undergo ligation (those without the appropriate target nucleic acid) will not hybridize as strongly with the circular DNA as those primers that were ligated. Thus, hybridization and/or washing conditions are used that discriminate between binding of the circular DNA to the ligated primer and the unligated primer.

Alternatively, when the circular probe is designed to hybridize to the target nucleic acid at a site distinct from the site bound to the ligated oligonucleotide, hybridization and washing conditions are used to remove or dissociate the target nucleic acid from unligated oligonucleotides while target nucleic acid hybridizing with the ligated oligonucleotides will remain bound to the beads. In this embodiment, the circular probe only hybridizes to the target nucleic acid when the target nucleic acid is hybridized with a ligated oligonucleotide that is immobilized on a bead.

As is well known in the art, an appropriate polymerase (3′ exo⁻) is added to the array. The polymerase extends the sequence of a single-stranded DNA using double-stranded DNA as a primer site. In one embodiment, the circular DNA that has hybridized with the appropriate OLA reaction product serves as the primer for the polymerase. In the presence of an appropriate reaction buffer as is known in the art, the polymerase will extend the sequence of the primer using the single-stranded circular DNA as a template. As there is no terminus of the circular DNA, the polymerase will continue to extend the sequence of the circular DNA. In an alternative embodiment, the RCA probe comprises a discrete primer site located within the circular probe. Hybridization of primer nucleic acids to this primer site forms the polymerase template allowing RCA to proceed.

In a preferred embodiment, the polymerase creates more than 100 copies of the circular DNA. In more preferred embodiments the polymerase creates more than 1000 copies of the circular DNA; while in a most preferred embodiment the polymerase creates more than 10,000 copies or more than 50,000 copies of the template.

The amplified circular DNA sequence is then detected by methods known in the art and as described herein. Detection is accomplished by hybridizing with a labeled probe. The probe is labeled directly or indirectly. Alternatively, labeled nucleotides are incorporated into the amplified circular DNA product. The nucleotides can be labeled directly, or indirectly as is further described herein.

The RCA as described herein finds use in allowing highly specific and highly sensitive detection of nucleic acid target sequences. In particular, the method finds use in improving the multiplexing ability of DNA arrays and eliminating costly sample or target preparation. As an example, a substantial savings in cost can be realized by directly analyzing genomic DNA on an array, rather than employing an intermediate PCR amplification step. The method finds use in examining genomic DNA and other samples including mRNA.

In addition the RCA finds use in allowing rolling circle amplification products to be easily detected by hybridization to probes in a solid-phase format (e.g. an array of beads). An additional advantage of the RCA is that it provides the capability of multiplex analysis so that large numbers of sequences can be analyzed in parallel. By combining the sensitivity of RCA and parallel detection on arrays, many sequences can be analyzed directly from genomic DNA.

Chemical Ligation Techniques

A variation of LCR utilizes a “chemical ligation” of sorts, as is generally outlined in U.S. Pat. Nos. 5,616,464 and 5,767,259, both of which are hereby expressly incorporated by reference in their entirety. In this embodiment, similar to enzymatic ligation, a pair of primers is utilized, wherein the first primer is substantially complementary to a first domain of the target and the second primer is substantially complementary to an adjacent second domain of the target (although, as for enzymatic ligation, if a “gap” exists, a polymerase and dNTPs may be added to “fill in” the gap). Each primer has a portion that acts as a “side chain” that does not bind the target sequence and acts as one half of a stem structure that interacts non-covalently through hydrogen bonding, salt bridges, van der Waals forces, etc. Preferred embodiments utilize substantially complementary nucleic acids as the side chains. Thus, upon hybridization of the primers to the target sequence, the side chains of the primers are brought into spatial proximity, and, if the side chains comprise nucleic acids as well, can also form side chain hybridization complexes.

At least one of the side chains of the primers comprises an activatable cross-linking agent, generally covalently attached to the side chain, that upon activation, results in a chemical cross-link or chemical ligation. The activatable group may comprise any moiety that will allow cross-linking of the side chains, and include groups activated chemically, photonically and thermally, with photoactivatable groups being preferred. In some embodiments a single activatable group on one of the side chains is enough to result in cross-linking via interaction to a functional group on the other side chain; in alternate embodiments, activatable groups are required on each side chain.

Once the hybridization complex is formed, and the cross-linking agent has been activated such that the primers have been covalently attached, the reaction is subjected to conditions to allow for the disassociation of the hybridization complex, thus freeing up the target to serve as a template for the next ligation or cross-linking. In this way, signal amplification occurs, and can be detected as outlined herein.

Invasive Cleavage Techniques

In a preferred embodiment, the signal amplification technique is invasive cleavage technology, which is described in a number of patents and patent applications, including U.S. Pat. Nos. 5,846,717; 5,614,402; 5,719,028; 5,541,311; and 5,843,669, all of which are hereby incorporated by reference in their entirety. Invasive cleavage technology is based on structure-specific nucleases that cleave nucleic acids in a site-specific manner. Two probes are used: an “invader” probe and a “signaling” probe, that adjacently hybridize to a target sequence with overlap. For mismatch discrimination, the invader technology relies on complementarity at the overlap position where cleavage occurs. The enzyme cleaves at the overlap, and releases the “tail” which may or may not be labeled. This can then be detected.

Generally, invasive cleavage technology may be described as follows. A target nucleic acid is recognized by two distinct probes. A first probe, generally referred to herein as an “invader” probe, is substantially complementary to a first portion of the target nucleic acid. A second probe, generally referred to herein as a “signal probe”, is partially complementary to the target nucleic acid; the 3′ end of the signal oligonucleotide is substantially complementary to the target sequence while the 5′ end is non-complementary and preferably forms a single-stranded “tail” or “arm”. The non-complementary end of the second probe preferably comprises a “generic” or “unique” sequence, frequently referred to herein as a “detection sequence”, that is used to indicate the presence or absence of the target nucleic acid, as described below. The detection sequence of the second probe preferably comprises at least one detectable label, although as outlined herein, since this detection sequence can function as a target sequence for a capture probe, sandwich configurations utilizing label probes as described herein may also be done.

Hybridization of the first and second oligonucleotides near or adjacent to one another on the target nucleic acid forms a number of structures. In a preferred embodiment, a forked cleavage structure forms and is a substrate of a nuclease which cleaves the detection sequence from the signal oligonucleotide. The site of cleavage is controlled by the distance or overlap between the 3′ end of the invader oligonucleotide and the downstream fork of the signal oligonucleotide. Therefore, neither oligonucleotide is subject to cleavage when misaligned or when unattached to target nucleic acid.

In a preferred embodiment, the nuclease that recognizes the forked cleavage structure and catalyzes release of the tail is thermostable, thereby allowing thermal cycling of the cleavage reaction, if desired. Preferred nucleases derived from thermostable DNA polymerases that have been modified to have reduced synthetic activity, which is an undesirable side-reaction during cleavage, are disclosed in U.S. Pat. Nos. 5,719,028 and 5,843,669, hereby expressly incorporated by reference. The synthetic activity of the DNA polymerase is reduced to a level where it does not interfere with detection of the cleavage reaction and detection of the freed tail. Preferably the DNA polymerase has no detectable polymerase activity. Examples of nucleases are those derived from Thermus aquaticus, Thermus flavus, or Thermus thermophilus.

In another embodiment, thermostable structure-specific nucleases are Flap endonucleases (FENs) selected from FEN-1 or FEN-2 like (e.g. XPG and RAD2 nucleases) from Archaebacterial species, for example, FEN-1 from Methanococcus jannaschii, Pyrococcus furiosus, Pyrococcus woesei, and Archaeoglobus fulgidus. (U.S. Pat. No. 5,843,669 and Lyamichev et al. 1999. Nature Biotechnology 17:292-297; both of which are hereby expressly incorporated by reference).

In a preferred embodiment, the nuclease is AfuFEN1 or PfuFEN1 nuclease. To cleave a forked structure, these nucleases require at least one overlapping nucleotide between the signal and invasive probes to recognize and cleave the 5′ end of the signal probe. To effect cleavage the 3′-terminal nucleotide of the invader oligonucleotide is not required to be complementary to the target nucleic acid. In contrast, mismatch of the signal probe one base upstream of the cleavage site prevents creation of the overlap and cleavage. The specificity of the nuclease reaction allows single nucleotide polymorphism (SNP) detection from, for example, genomic DNA, as outlined below (Lyamichev et al.).
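The two conditions stated above, at least one overlapping nucleotide and a matched signal probe at the overlap, can be expressed as a simple predicate. This is a deliberately coarse sketch: it represents each probe only by the stretch of target it covers, using hypothetical coordinates, and it does not model enzyme kinetics or the position of the cut.

```python
def overlap_length(invader_span: tuple, signal_span: tuple) -> int:
    """Number of target positions covered by both probes.

    Each span is a half-open (start, end) interval of target positions
    covered by the probe's target-complementary region.
    """
    start = max(invader_span[0], signal_span[0])
    end = min(invader_span[1], signal_span[1])
    return max(0, end - start)


def predicts_cleavage(invader_span: tuple, signal_span: tuple,
                      signal_matches_at_overlap: bool) -> bool:
    """Cleavage is predicted only if the probes overlap by at least one
    nucleotide and the signal probe is matched to the target there."""
    has_overlap = overlap_length(invader_span, signal_span) >= 1
    return has_overlap and signal_matches_at_overlap


# Hypothetical layouts on a target numbered from 0.
matched = predicts_cleavage((20, 40), (5, 21), signal_matches_at_overlap=True)
no_overlap = predicts_cleavage((21, 40), (5, 21), signal_matches_at_overlap=True)
mismatch = predicts_cleavage((20, 40), (5, 21), signal_matches_at_overlap=False)
print(matched, no_overlap, mismatch)  # prints True False False
```

For SNP detection, the mismatched case corresponds to a signal probe designed for one allele encountering the other allele at the overlap position.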

The invasive cleavage assay is preferably performed on an array format. In a preferred embodiment, the signal probe has a detectable label attached 5′ from the site of nuclease cleavage (e.g. within the detection sequence) and a capture tag, as described below (e.g. biotin or other hapten), 3′ from the site of nuclease cleavage. After the assay is carried out, the 3′ portion of the cleaved signal probe (which carries the capture tag) is extracted, for example, by binding to streptavidin beads or by crosslinking through the capture tag to produce aggregates or by antibody to an attached hapten. By “capture tag” herein is meant one of a pair of binding partners as described above, such as antigen/antibody pairs, digoxigenin, dinitrophenol, etc.

The cleaved 5′ region of the signal probe, e.g. the detection sequence, comprises a label and is detected and optionally quantitated. In one embodiment, the cleaved 5′ region is hybridized to a probe on an array (capture probe) and optically detected. As described below, many signal probes can be analyzed in parallel by hybridization to their complementary probes in an array.

In a preferred embodiment, the invasive cleavage reaction is configured to utilize a fluorophore-quencher reaction. A signaling probe comprising both a fluorophore and a quencher is used, with the fluorophore and the quencher on opposite sides of the cleavage site. As will be appreciated by those in the art, these will be positioned closely together. Thus, in the absence of cleavage, very little signal is seen due to the quenching reaction. After cleavage, however, the distance between the two is large, and thus fluorescence can be detected. Upon assembly of an assay complex, comprising the target sequence, an invader probe, and a signaling probe, and the introduction of the cleavage enzyme, the cleavage of the complex results in the disassociation of the quencher from the complex, resulting in an increase in fluorescence.

In this embodiment, suitable fluorophore-quencher pairs are as known in the art. For example, suitable quencher molecules comprise Dabcyl.

As will be appreciated by those in the art, this system can be configured in a variety of conformations, as discussed in FIG. 4.

In a preferred embodiment, to obtain higher specificity and reduce the detection of contaminating uncleaved signal probe or incorrectly cleaved product, an additional enzymatic recognition step is introduced in the array capture procedure. For example, the cleaved signal probe binds to a capture probe to produce a double-stranded nucleic acid in the array. In this embodiment, the 3′ end of the cleaved signal probe is adjacent to the 5′ end of one strand of the capture probe, thereby forming a substrate for DNA ligase (Broude et al. 1994. PNAS 91: 3072-3076). Only correctly cleaved product is ligated to the capture probe. Other incorrectly hybridized and non-cleaved signal probes are removed, for example, by heat denaturation, high stringency washes, and other methods that disrupt base pairing.

Cycling Probe Techniques (CPT)

In a preferred embodiment, the signal amplification technique is CPT. CPT technology is described in a number of patents and patent applications, including U.S. Pat. Nos. 5,011,769, 5,403,711, 5,660,988, and 4,876,187, and PCT published applications WO 95/05480, WO 95/14106, and WO 95/00667, and U.S. Ser. No. 09/014,304, all of which are expressly incorporated by reference in their entirety.

Generally, CPT may be described as follows. A CPT primer (also sometimes referred to herein as a “scissile primer”) comprises two probe sequences separated by a scissile linkage. The CPT primer is substantially complementary to the target sequence and thus will hybridize to it to form a hybridization complex. The scissile linkage is cleaved, without cleaving the target sequence, resulting in the two probe sequences being separated. The two probe sequences can thus be more easily disassociated from the target, and the reaction can be repeated any number of times. The cleaved primer is then detected as outlined herein.

By “scissile linkage” herein is meant a linkage within the scissile probe that can be cleaved when the probe is part of a hybridization complex, that is, when a double-stranded complex is formed. It is important that cleavage at the scissile linkage cut only the scissile probe and not the sequence to which it is hybridized (i.e. either the target sequence or a probe sequence), such that the target sequence may be reused in the reaction for amplification of the signal. As used herein, the scissile linkage is any connecting chemical structure which joins two probe sequences and which is capable of being selectively cleaved without cleavage of either the probe sequences or the sequence to which the scissile probe is hybridized. The scissile linkage may be a single bond, or a multiple unit sequence. As will be appreciated by those in the art, a number of possible scissile linkages may be used.

In a preferred embodiment, the scissile linkage comprises RNA. This system, previously described as outlined above, is based on the fact that certain double-stranded nucleases, particularly ribonucleases, will nick or excise RNA nucleosides from an RNA:DNA hybridization complex. Of particular use in this embodiment are RNAse H, Exo III, and reverse transcriptase.

In one embodiment, the entire scissile probe is made of RNA; nicking is facilitated especially when carried out with a double-stranded ribonuclease, such as RNAse H or Exo III. Probes made entirely of RNA sequences are particularly useful because first, they can be more easily produced enzymatically, and second, they have more cleavage sites which are accessible to nicking or cleaving by a nicking agent, such as the ribonucleases. Thus, scissile probes made entirely of RNA do not rely on a separate scissile linkage, since the scissile linkage is inherent in the probe.

In a preferred embodiment, when the scissile linkage is a nucleic acid such as RNA, the methods of the invention may be used to detect mismatches, as is generally described in U.S. Pat. No. 5,660,988, and WO 95/14106, hereby expressly incorporated by reference. These mismatch detection methods are based on the fact that RNAse H may not bind to and/or cleave an RNA:DNA duplex if there are mismatches present in the sequence. Thus, in the NA1-R-NA2 embodiments, NA1 and NA2 are non-RNA nucleic acids, preferably DNA. Preferably, the mismatch is within the RNA:DNA duplex, but in some embodiments the mismatch is present in an adjacent sequence very close to the desired sequence, close enough to affect the RNAse H (generally within one or two bases). Thus, in this embodiment, the nucleic acid scissile linkage is designed such that the sequence of the scissile linkage reflects the particular sequence to be detected, i.e. the area of the putative mismatch.

In some embodiments of mismatch detection, the rate of generation of the released fragments is such that the methods provide, essentially, a yes/no result, whereby the detection of virtually any released fragment indicates the presence of the desired target sequence. Typically, however, when there is only a minimal mismatch (for example, a 1-, 2- or 3-base mismatch, or a 3-base deletion), there is some generation of cleaved sequences even though the target sequence is not present. Thus, the rate of generation of cleaved fragments, and/or the final amount of cleaved fragments, is quantified to indicate the presence or absence of the target. In addition, the use of secondary and tertiary scissile probes may be particularly useful in this embodiment, as this can amplify the differences between a perfect match and a mismatch. These methods may be particularly useful in the determination of homozygotic or heterozygotic states of a patient.

In this embodiment, it is an important feature of the scissile linkage that its length is determined by the suspected difference between the target and the probe. In particular, this means that the scissile linkage must be of sufficient length to encompass the suspected difference, yet short enough so that the scissile linkage cannot inappropriately “specifically hybridize” to the selected nucleic acid molecule when the suspected difference is present; such inappropriate hybridization would permit excision and thus cleavage of scissile linkages even though the selected nucleic acid molecule was not fully complementary to the nucleic acid probe. Thus in a preferred embodiment, the scissile linkage is from 3 to 5 nucleotides in length, such that a suspected nucleotide difference of from 1 nucleotide to 3 nucleotides is encompassed by the scissile linkage, and 0, 1 or 2 nucleotides are on either side of the difference.
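
The length constraint just described can be illustrated with a short sketch. It enumerates the flank arrangements that keep a scissile linkage within 3 to 5 nucleotides while placing 0, 1 or 2 nucleotides on either side of a suspected difference of 1 to 3 nucleotides. The function name `linkage_layouts` and its parameters are hypothetical and serve only as an illustration; they are not part of any described method.

```python
from itertools import product


def linkage_layouts(diff_len, min_len=3, max_len=5, max_flank=2):
    """Enumerate (left_flank, right_flank, total_length) layouts.

    diff_len  : length of the suspected difference, 1 to 3 nucleotides
    min_len   : shortest allowed scissile linkage (3 in the text)
    max_len   : longest allowed scissile linkage (5 in the text)
    max_flank : most nucleotides allowed on either side (2 in the text)
    All names here are illustrative assumptions.
    """
    if not 1 <= diff_len <= 3:
        raise ValueError("suspected difference must be 1 to 3 nucleotides")
    layouts = []
    for left, right in product(range(max_flank + 1), repeat=2):
        total = left + diff_len + right
        if min_len <= total <= max_len:
            layouts.append((left, right, total))
    return layouts


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, linkage_layouts(d))
```

For a single-nucleotide difference, for example, a symmetric layout with one flanking nucleotide on each side gives the shortest permitted linkage of 3 nucleotides.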

Thus, when the scissile linkage is nucleic acid, preferred embodiments utilize from 1 to about 100 nucleotides, with from about 2 to about 20 being preferred and from about 5 to about 10 being particularly preferred.

CPT may be done enzymatically or chemically. That is, in addition to RNAse H, there are several other cleaving agents which may be useful in cleaving RNA (or other nucleic acid) scissile bonds. For example, several chemical nucleases have been reported; see for example Sigman et al., Annu. Rev. Biochem. 1990, 59, 207-236; Sigman et al., Chem. Rev. 1993, 93, 2295-2316; Bashkin et al., J. Org. Chem. 1990, 55, 5125-5132; and Sigman et al., Nucleic Acids and Molecular Biology, vol. 3, F. Eckstein and D. M. J. Lilley (Eds), Springer-Verlag, Heidelberg 1989, pp. 13-27; all of which are hereby expressly incorporated by reference.

Specific RNA hydrolysis is also an active area; see for example Chin, Acc. Chem. Res. 1991, 24, 145-152; Breslow et al., Tetrahedron, 1991, 47, 2365-2376; Anslyn et al., Angew. Chem. Int. Ed. Engl., 1997, 36, 432-450; and references therein, all of which are expressly incorporated by reference. Reactive phosphate centers are also of interest in developing scissile linkages; see Hendry et al., Prog. Inorg. Chem.: Bioinorganic Chem. 1990, 31, 201-258, also expressly incorporated by reference.

Current approaches to site-directed RNA hydrolysis include the conjugation of a reactive moiety capable of cleaving phosphodiester bonds to a recognition element capable of sequence-specifically hybridizing to RNA. In most cases, a metal complex is covalently attached to a DNA strand which forms a stable heteroduplex. Upon hybridization, a Lewis acid is placed in close proximity to the RNA backbone to effect hydrolysis; see Magda et al., J. Am. Chem. Soc. 1994, 116, 7439; Hall et al., Chem. Biology 1994, 1, 185-190; Bashkin et al., J. Am. Chem. Soc. 1994, 116, 5981-5982; Hall et al., Nucleic Acids Res. 1996, 24, 3522; Magda et al., J. Am. Chem. Soc. 1997, 119, 2293; and Magda et al., J. Am. Chem. Soc. 1997, 119, 6947, all of which are expressly incorporated by reference.

In a similar fashion, DNA-polyamine conjugates have been demonstrated to induce site-directed RNA strand scission; see for example, Yoshinari et al., J. Am. Chem. Soc. 1991, 113, 5899-5901; Endo et al., J. Org. Chem. 1997, 62, 846; and Barbier et al., J. Am. Chem. Soc. 1992, 114, 3511-3515, all of which are expressly incorporated by reference.

In a preferred embodiment, the scissile linkage is not necessarily RNA. For example, chemical cleavage moieties may be used to cleave basic sites in nucleic acids; see Belmont, et al., New J. Chem. 1997, 21, 47-54; and references therein, all of which are expressly incorporated herein by reference. Similarly, photo cleavable moieties, for example, using transition metals, may be used; see Moucheron, et al., Inorg. Chem. 1997, 36, 584-592, hereby expressly incorporated by reference.

Other approaches rely on chemical moieties or enzymes; see for example Keck et al., Biochemistry 1995, 34, 12029-12037; Kirk et al., Chem. Commun. 1998, in press; cleavage of G-U basepairs by metal complexes; see Biochemistry, 1992, 31, 5423-5429; diamine complexes for cleavage of RNA; Komiyama, et al., J. Org. Chem. 1997, 62, 2155-2160; and Chow et al., Chem. Rev. 1997, 97, 1489-1513, and references therein, all of which are expressly incorporated herein by reference.

The first step of the CPT method requires hybridizing a primary scissile primer (also called a primary scissile probe) to the target. This is preferably done at a temperature that allows both the binding of the longer primary probe and disassociation of the shorter cleaved portions of the primary probe, as will be appreciated by those in the art. As outlined herein, this may be done in solution, or either the target or one or more of the scissile probes may be attached to a solid support. For example, it is possible to utilize “anchor probes” on a solid support which are substantially complementary to a portion of the target sequence, preferably a sequence that is not the same sequence to which a scissile probe will bind.

Similarly, as outlined herein, a preferred embodiment has one or more of the scissile probes attached to a solid support such as a bead. In this embodiment, the soluble target diffuses to allow the formation of the hybridization complex between the soluble target sequence and the support-bound scissile probe. In this embodiment, it may be desirable to include additional scissile linkages in the scissile probes to allow the release of two or more probe sequences, such that more than one probe sequence per scissile probe may be detected, as is outlined below, in the interests of maximizing the signal.

In this embodiment (and in other techniques herein), preferred methods utilize cutting or shearing techniques to cut the nucleic acid sample containing the target sequence into a size that will allow sufficient diffusion of the target sequence to the surface of a bead. This may be accomplished by shearing the nucleic acid through mechanical forces (e.g. sonication) or by cleaving the nucleic acid using restriction endonucleases. Alternatively, a fragment containing the target may be generated using polymerase, primers and the sample as a template, as in polymerase chain reaction (PCR). In addition, amplification of the target using PCR or LCR or related methods may also be done; this may be particularly useful when the target sequence is present in the sample at extremely low copy numbers. Similarly, numerous techniques are known in the art to increase the rate of mixing and hybridization including agitation, heating, techniques that increase the overall concentration such as precipitation, drying, dialysis, centrifugation, electrophoresis, magnetic bead concentration, etc.

In general, the scissile probes are introduced in a molar excess to their targets (including either the target sequence or other scissile probes, for example when secondary or tertiary scissile probes are used), with ratios of scissile probe:target of at least about 100:1 being preferred, at least about 1000:1 being particularly preferred, and at least about 10,000:1 being especially preferred. In some embodiments the excess of probe:target will be much greater. In addition, ratios such as these may be used for all the amplification techniques outlined herein.
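
As a simple worked illustration of these ratios, the sketch below computes the minimum amount of scissile probe for a given amount of target at each stated level of preference. The dictionary `EXCESS_RATIOS` and the function `min_probe_amount` are hypothetical names introduced for illustration; the amounts may be in any molar unit, provided probe and target use the same unit.

```python
# Minimum scissile probe:target molar ratios taken from the text above.
EXCESS_RATIOS = {
    "preferred": 100,
    "particularly preferred": 1_000,
    "especially preferred": 10_000,
}


def min_probe_amount(target_amount, tier="preferred"):
    """Return the minimum probe amount (same molar unit as the target)."""
    if target_amount < 0:
        raise ValueError("target amount must be non-negative")
    return target_amount * EXCESS_RATIOS[tier]


if __name__ == "__main__":
    # Example: 2 units of target (e.g. 2 fmol) at each tier.
    for tier in EXCESS_RATIOS:
        print(tier, min_probe_amount(2, tier))
```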

Once the hybridization complex between the primary scissile probe and the target has been formed, the complex is subjected to cleavage conditions. As will be appreciated, this depends on the composition of the scissile probe; if it is RNA, RNAse H is introduced. It should be noted that under certain circumstances, such as is generally outlined in WO 95/00666 and WO 95/00667, hereby incorporated by reference, the use of a double-stranded binding agent such as RNAse H may allow the reaction to proceed even at temperatures above the Tm of the primary probe:target hybridization complex. Accordingly, the scissile probe can either be added to the target first, with the cleavage agent or cleavage conditions introduced afterwards, or the probes may be added in the presence of the cleavage agent or conditions.

The cleavage conditions result in the separation of the two (or more) probe sequences of the primary scissile probe. As a result, the shorter probe sequences will no longer remain hybridized to the target sequence, and thus the hybridization complex will disassociate, leaving the target sequence intact.

The optimal temperature for carrying out the CPT reactions is generally from about 5° C. to about 25° C. below the melting temperature (Tm) of the probe:target hybridization complex. This provides for a rapid rate of hybridization and high degree of specificity for the target sequence. The Tm of any particular hybridization complex depends on salt concentration, G-C content, and length of the complex, as is known in the art and described herein.
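
The dependence of Tm on salt concentration, G-C content and length, and the resulting reaction temperature window, can be sketched as follows. The formula used is one commonly quoted empirical approximation, Tm = 81.5 + 16.6·log10[Na+] + 0.41·(%GC) − 675/N; it is an assumption introduced here for illustration, is not taken from this specification, and is generally less reliable for short oligonucleotides than nearest-neighbor methods. The function names are likewise hypothetical.

```python
import math


def estimate_tm_celsius(seq, na_molar=0.1):
    """Rough Tm estimate (deg C) from salt, G-C content and length.

    Uses an assumed empirical approximation:
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    if na_molar <= 0:
        raise ValueError("salt concentration must be positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / n


def cpt_temperature_window(tm_celsius):
    """Return (low, high): about 25 to about 5 deg C below the Tm."""
    return (tm_celsius - 25.0, tm_celsius - 5.0)


if __name__ == "__main__":
    tm = estimate_tm_celsius("GCGCGCGCGCATATATATAT", na_molar=1.0)
    print(round(tm, 2), cpt_temperature_window(tm))
```

In practice the window would be narrowed empirically, since the estimate is only a starting point.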

During the reaction, as for the other amplification techniques herein, it may be necessary to suppress cleavage of the probe, as well as the target sequence, by nonspecific nucleases. Such nucleases are generally removed from the sample during the isolation of the DNA by heating or extraction procedures. A number of inhibitors of single-stranded nucleases, such as vanadate, Inhibit-ACE and RNAsin, a placental protein, do not affect the activity of RNAse H. Such suppression may not be necessary, depending on the purity of the RNAse H and/or the target sample.

These steps are repeated by allowing the reaction to proceed for a period of time. The reaction is usually carried out for about 15 minutes to about 1 hour. Generally, each molecule of the target sequence will turn over between 100 and 1000 times in this period, depending on the length and sequence of the probe, the specific reaction conditions, and the cleavage method. For example, for each copy of the target sequence present in the test sample, 100 to 1000 probe molecules will be cleaved by RNAse H. Higher levels of amplification can be obtained by allowing the reaction to proceed longer, or using secondary, tertiary, or quaternary probes, as is outlined herein.
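
The arithmetic behind this turnover can be sketched as below. The function `cleaved_probe_count` is a hypothetical illustration. Treating each additional probe level (secondary, tertiary, and so on) as multiplying the yield by the same turnover is a simplifying assumption made here, not a statement of measured behavior.

```python
def cleaved_probe_count(target_copies, turnover_per_target=100, probe_levels=1):
    """Estimate the number of cleaved probe molecules.

    target_copies       : copies of target sequence in the sample
    turnover_per_target : times each target turns over (100 to 1000 in the text)
    probe_levels        : 1 for primary probes only; each further level is
                          ASSUMED to multiply the yield by the same turnover
    """
    if target_copies < 0 or turnover_per_target < 0 or probe_levels < 1:
        raise ValueError("invalid argument")
    return target_copies * turnover_per_target ** probe_levels


if __name__ == "__main__":
    # 10 target copies with the low and high turnover figures from the text.
    print(cleaved_probe_count(10, 100), cleaved_probe_count(10, 1000))
```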

Upon completion of the reaction, generally determined by time or amount of cleavage, the uncleaved scissile probes must be removed or neutralized prior to detection, such that the uncleaved probe does not bind to a detection probe, causing false positive signals. This may be done in a variety of ways, as is generally described below.

In a preferred embodiment, the separation is facilitated by the use of beads containing the primary probe. Thus, when the scissile probes are attached to beads, removal of the beads by filtration, centrifugation, the application of a magnetic field, electrostatic interactions for charged beads, adhesion, etc., results in the removal of the uncleaved probes.

In a preferred embodiment, the separation is based on strong acid precipitation. This is useful to separate long (generally greater than 50 nucleotides) from smaller fragments (generally about 10 nucleotides). The introduction of a strong acid such as trichloroacetic acid into the solution causes the longer probe to precipitate, while the smaller cleaved fragments remain in solution. The solution can be centrifuged or filtered to remove the precipitate, and the cleaved probe sequences can be quantitated.

In a preferred embodiment, the scissile probe contains both a detectable label and an affinity binding ligand or moiety, such that an affinity support is used to carry out the separation. In this embodiment, it is important that the detectable label used for detection is not on the same probe sequence that contains the affinity moiety, such that removal of the uncleaved probe, and the cleaved probe containing the affinity moiety, does not remove all the detectable labels. Alternatively, the scissile probe may contain a capture tag; the binding partner of the capture tag is attached to a solid support such as glass beads, latex beads, dextrans, etc. and used to pull out the uncleaved probes, as is known in the art. The cleaved probe sequences, which do not contain the capture tag, remain in solution and then can be detected as outlined below.

In a preferred embodiment, similar to the above embodiment, a separation sequence of nucleic acid is included in the scissile probe, which is not cleaved during the reaction. A nucleic acid complementary to the separation sequence is attached to a solid support such as a bead and serves as a catcher sequence. Preferably, the separation sequence is added to the scissile probes, and is not recognized by the target sequence, such that a generalized catcher sequence may be utilized in a variety of assays.

After removal of the uncleaved probe, as required, detection proceeds via the addition of the cleaved probe sequences to the array compositions, as outlined below. In general, the cleaved probe is bound to a capture probe, either directly or indirectly, and the label is detected. In a preferred embodiment, no higher order probes are used, and detection is based on the probe sequence(s) of the primary primer. In a preferred embodiment, at least one, and preferably more, secondary probes (also referred to herein as secondary primers) are used; the secondary probes hybridize to the domains of the cleavage probes; etc.

Thus, CPT requires, again in no particular order, a first CPT primer comprising a first probe sequence, a scissile linkage and a second probe sequence; and a cleavage agent.

In this manner, CPT results in the generation of a large amount of cleaved primers, which then can be detected as outlined below.

Sandwich Assay Techniques

In a preferred embodiment, the signal amplification technique is a “sandwich” assay, as is generally described in U.S. Ser. No. 60/073,011 and in U.S. Pat. Nos. 5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by reference. Although sandwich assays do not result in the alteration of primers, sandwich assays can be considered signal amplification techniques since multiple signals (i.e. label probes) are bound to a single target, resulting in the amplification of the signal. Sandwich assays may be used when the target sequence does not contain a label; or when adapters are used, as outlined below.

As discussed herein, it should be noted that the sandwich assays can be used for the detection of primary target sequences (e.g. from a patient sample), or as a method to detect the product of an amplification reaction as outlined above; thus for example, any of the newly synthesized strands outlined above, for example using PCR, LCR, NASBA, SDA, etc., may be used as the “target sequence” in a sandwich assay.

As will be appreciated by those in the art, the systems of the invention may take on a large number of different configurations. In general, there are three types of systems that can be used: (1) “non-sandwich” systems (also referred to herein as “direct” detection) in which the target sequence itself is labeled with detectable labels (again, either because the primers comprise labels or due to the incorporation of labels into the newly synthesized strand); (2) systems in which label probes directly bind to the target sequences; and (3) systems in which label probes are indirectly bound to the target sequences, for example through the use of amplifier probes.

The anchoring of the target sequence to the bead is done through the use of capture probes and, optionally, capture extender probes (sometimes referred to as “adapter sequences” herein). When only capture probes are utilized, it is necessary to have unique capture probes for each target sequence; that is, the surface must be customized to contain unique capture probes; e.g. each bead comprises a different capture probe. Alternatively, capture extender probes may be used, which allow a “universal” surface, i.e. a surface containing a single type of capture probe that can be used to detect any target sequence. “Capture extender” probes have a first portion that will hybridize to all or part of the capture probe, and a second portion that will hybridize to a first portion of the target sequence. This then allows the generation of customized soluble probes, which as will be appreciated by those in the art is generally simpler and less costly. As shown herein, two capture extender probes may be used. This has generally been done to stabilize assay complexes, for example when the target sequence is large, or when large amplifier probes (particularly branched or dendrimer amplifier probes) are used.

Detection of the amplification reactions of the invention, including the direct detection of amplification products and indirect detection utilizing label probes (i.e. sandwich assays), is preferably done by detecting assay complexes comprising detectable labels, which can be attached to the assay complex in a variety of ways, as is more fully described below.

Once the target sequence has preferably been anchored to the array, an amplifier probe is hybridized to the target sequence, either directly, or through the use of one or more label extender probes, which serve to allow “generic” amplifier probes to be made. As for all the steps outlined herein, this may be done simultaneously with capturing, or sequentially. Preferably, the amplifier probe contains a multiplicity of amplification sequences, although in some embodiments, as described below, the amplifier probe may contain only a single amplification sequence, or at least two amplification sequences. The amplifier probe may take on a number of different forms; either a branched conformation, a dendrimer conformation, or a linear “string” of amplification sequences. Label probes comprising detectable labels (preferably but not required to be fluorophores) then hybridize to the amplification sequences (or in some cases the label probes hybridize directly to the target sequence), and the labels are detected, as is more fully outlined below.

Accordingly, the present invention provides compositions comprising an amplifier probe. By “amplifier probe” or “nucleic acid multimer” or “amplification multimer” or grammatical equivalents herein is meant a nucleic acid probe that is used to facilitate signal amplification. Amplifier probes comprise at least a first single-stranded nucleic acid probe sequence, as defined below, and at least one single-stranded nucleic acid amplification sequence, with a multiplicity of amplification sequences being preferred.

Amplifier probes comprise a first probe sequence that is used, either directly or indirectly, to hybridize to the target sequence. That is, the amplifier probe itself may have a first probe sequence that is substantially complementary to the target sequence, or it has a first probe sequence that is substantially complementary to a portion of an additional probe, in this case called a label extender probe, that has a first portion that is substantially complementary to the target sequence. In a preferred embodiment, the first probe sequence of the amplifier probe is substantially complementary to the target sequence.

In general, as for all the probes herein, the first probe sequence is of a length sufficient to give specificity and stability. Thus generally, the probe sequences of the invention that are designed to hybridize to another nucleic acid (i.e. probe sequences, amplification sequences, portions or domains of larger probes) are at least about 5 nucleosides long, with at least about 10 being preferred and at least about 15 being especially preferred.

In a preferred embodiment, several different amplifier probes are used, each with first probe sequences that will hybridize to a different portion of the target sequence. That is, there is more than one level of amplification; the amplifier probe provides an amplification of signal due to a multiplicity of labelling events, and several different amplifier probes, each with this multiplicity of labels, are used for each target sequence. Thus, preferred embodiments utilize at least two different pools of amplifier probes, each pool having a different probe sequence for hybridization to different portions of the target sequence; the only real limitation on the number of different amplifier probes will be the length of the original target sequence. In addition, it is also possible that the different amplifier probes contain different amplification sequences, although this is generally not preferred.

In a preferred embodiment, the amplifier probe does not hybridize to the sample target sequence directly, but instead hybridizes to a first portion of a label extender probe. This is particularly useful to allow the use of “generic” amplifier probes, that is, amplifier probes that can be used with a variety of different targets. This may be desirable since several of the amplifier probes require special synthesis techniques. Thus, the addition of a relatively short probe as a label extender probe is preferred. Thus, the first probe sequence of the amplifier probe is substantially complementary to a first portion or domain of a first label extender single-stranded nucleic acid probe. The label extender probe also contains a second portion or domain that is substantially complementary to a portion of the target sequence. Both of these portions are preferably at least about 10 to about 50 nucleotides in length, with a range of about 15 to about 30 being preferred. The terms “first” and “second” are not meant to confer an orientation of the sequences with respect to the 5′-3′ orientation of the target or probe sequences. For example, assuming a 5′-3′ orientation of the complementary target sequence, the first portion may be located either 5′ to the second portion, or 3′ to the second portion. For convenience herein, the order of probe sequences is generally shown from left to right.

In a preferred embodiment, more than one label extender probe-amplifier probe pair may be used, that is, n is more than 1. That is, a plurality of label extender probes may be used, each with a portion that is substantially complementary to a different portion of the target sequence; this can serve as another level of amplification. Thus, a preferred embodiment utilizes pools of at least two label extender probes, with the upper limit being set by the length of the target sequence.

In a preferred embodiment, more than one label extender probe is used with a single amplifier probe to reduce non-specific binding, as is generally outlined in U.S. Pat. No. 5,681,697, incorporated by reference herein. In this embodiment, a first portion of the first label extender probe hybridizes to a first portion of the target sequence, and the second portion of the first label extender probe hybridizes to a first probe sequence of the amplifier probe. A first portion of the second label extender probe hybridizes to a second portion of the target sequence, and the second portion of the second label extender probe hybridizes to a second probe sequence of the amplifier probe. These form structures sometimes referred to as “cruciform” structures or configurations, and are generally done to confer stability when large branched or dendrimeric amplifier probes are used.

In addition, as will be appreciated by those in the art, the label extender probes may interact with a preamplifier probe, described below, rather than the amplifier probe directly.

Similarly, as outlined above, a preferred embodiment utilizes several different amplifier probes, each with first probe sequences that will hybridize to a different portion of the label extender probe. In addition, as outlined above, it is also possible that the different amplifier probes contain different amplification sequences, although this is generally not preferred.

In addition to the first probe sequence, the amplifier probe also comprises at least one amplification sequence. By “amplification sequence” or “amplification segment” or grammatical equivalents herein is meant a sequence that is used, either directly or indirectly, to bind to a first portion of a label probe as is more fully described below. Preferably, the amplifier probe comprises a multiplicity of amplification sequences, with from about 3 to about 1000 being preferred, from about 10 to about 100 being particularly preferred, and about 50 being especially preferred. In some cases, for example when linear amplifier probes are used, from 1 to about 20 is preferred with from about 5 to about 10 being particularly preferred.
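
The levels of amplification in a sandwich assay can be illustrated with a short calculation: the number of labels bound per target is, at most, the number of amplifier probes bound to the target multiplied by the amplification sequences per amplifier probe and by the labels per label probe. The function `labels_per_target` is a hypothetical illustration, and the assumption that every amplification sequence is occupied by one label probe is an idealization.

```python
def labels_per_target(amplifier_probes, amplification_sequences,
                      labels_per_label_probe=1):
    """Upper-bound estimate of detectable labels bound to one target.

    amplifier_probes        : amplifier probes bound per target sequence
    amplification_sequences : amplification sequences per amplifier probe
    labels_per_label_probe  : labels carried by each label probe
    ASSUMES full occupancy of every amplification sequence.
    """
    if min(amplifier_probes, amplification_sequences,
           labels_per_label_probe) < 0:
        raise ValueError("counts must be non-negative")
    return amplifier_probes * amplification_sequences * labels_per_label_probe


if __name__ == "__main__":
    # Two amplifier probes, each with about 50 amplification sequences.
    print(labels_per_target(2, 50))
```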

The amplification sequences may be linked to each other in a variety ofways, as will be appreciated by those in the art. They may be covalentlylinked directly to each other, or to intervening sequences or chemicalmoieties, through nucleic acid linkages such as phosphodiester bonds,PNA bonds, etc., or through interposed linking agents such amino acid,carbohydrate or polyol bridges, or through other cross-linking agents orbinding partners. The site(s) of linkage may be at the ends of asegment, and/or at one or more internal nucleotides in the strand. In apreferred embodiment, the amplification sequences are attached vianucleic acid linkages.

In a preferred embodiment, branched amplifier probes are used, as are generally described in U.S. Pat. No. 5,124,246, hereby incorporated by reference. Branched amplifier probes may take on “fork-like” or “comb-like” conformations. “Fork-like” branched amplifier probes generally have three or more oligonucleotide segments emanating from a point of origin to form a branched structure. The point of origin may be another nucleotide segment or a multifunctional molecule to which at least three segments can be covalently or tightly bound. “Comb-like” branched amplifier probes have a linear backbone with a multiplicity of sidechain oligonucleotides extending from the backbone. In either conformation, the pendant segments will normally depend from a modified nucleotide or other organic moiety having the appropriate functional groups for attachment of oligonucleotides. Furthermore, in either conformation, a large number of amplification sequences are available for binding, either directly or indirectly, to detection probes. In general, these structures are made as is known in the art, using modified multifunctional nucleotides, as is described in U.S. Pat. Nos. 5,635,352 and 5,124,246, among others.

In a preferred embodiment, dendrimer amplifier probes are used, as are generally described in U.S. Pat. No. 5,175,270, hereby expressly incorporated by reference. Dendrimeric amplifier probes have amplification sequences that are attached via hybridization, and thus have portions of double-stranded nucleic acid as a component of their structure. The outer surface of the dendrimer amplifier probe has a multiplicity of amplification sequences.

In a preferred embodiment, linear amplifier probes are used, that have individual amplification sequences linked end-to-end either directly or with short intervening sequences to form a polymer. As with the other amplifier configurations, there may be additional sequences or moieties between the amplification sequences. In one embodiment, the linear amplifier probe has a single amplification sequence.

In addition, the amplifier probe may be totally linear, totally branched, totally dendrimeric, or any combination thereof.

The amplification sequences of the amplifier probe are used, either directly or indirectly, to bind to a label probe to allow detection. In a preferred embodiment, the amplification sequences of the amplifier probe are substantially complementary to a first portion of a label probe. Alternatively, amplifier extender probes are used, that have a first portion that binds to the amplification sequence and a second portion that binds to the first portion of the label probe.

In addition, the compositions of the invention may include “preamplifier” molecules, which serve as a bridging moiety between the label extender molecules and the amplifier probes. In this way, more amplifier and thus more labels are ultimately bound to the detection probes. Preamplifier molecules may be either linear or branched, and typically contain in the range of about 30-3000 nucleotides.

Thus, label probes are either substantially complementary to an amplification sequence or to a portion of the target sequence.

Detection of the amplification reactions of the invention, including the direct detection of amplification products and indirect detection utilizing label probes (i.e. sandwich assays), is done by detecting assay complexes comprising labels as is outlined herein.

In addition to amplification techniques, the present invention also provides a variety of genotyping reactions that can be similarly detected and/or quantified.

Genotyping

In this embodiment, the invention provides compositions and methods for the detection (and optionally quantification) of differences or variations of sequences (e.g. SNPs) using bead arrays for detection of the differences. That is, the bead array serves as a platform on which a variety of techniques may be used to elucidate the nucleotide at the position of interest (“the detection position”). In general, the methods described herein relate to the detection of nucleotide substitutions, although as will be appreciated by those in the art, deletions, insertions, inversions, etc. may also be detected.

These techniques fall into five general categories: (1) techniques that rely on traditional hybridization methods that utilize the variation of stringency conditions (temperature, buffer conditions, etc.) to distinguish nucleotides at the detection position; (2) extension techniques that add a base (“the base”) to basepair with the nucleotide at the detection position; (3) ligation techniques, that rely on the specificity of ligase enzymes (or, in some cases, on the specificity of chemical techniques), such that ligation reactions occur preferentially if perfect complementarity exists at the detection position; (4) cleavage techniques, that also rely on enzymatic or chemical specificity such that cleavage occurs preferentially if perfect complementarity exists; and (5) techniques that combine these methods.

As outlined herein, in this embodiment the target sequence comprises a position for which sequence information is desired, generally referred to herein as the “detection position” or “detection locus”. In a preferred embodiment, the detection position is a single nucleotide, although in some embodiments, it may comprise a plurality of nucleotides, either contiguous with each other or separated by one or more nucleotides. By “plurality” as used herein is meant at least two. As used herein, the base which basepairs with a detection position base in a hybrid is termed a “readout position” or an “interrogation position”.

In some embodiments, as is outlined herein, the target sequence may not be the sample target sequence but instead is a product of a reaction herein, sometimes referred to herein as a “secondary” or “derivative” target sequence. Thus, for example, in SBE, the extended primer may serve as the target sequence; similarly, in invasive cleavage variations, the cleaved detection sequence may serve as the target sequence.

As above, if required, the target sequence is prepared using known techniques. Once prepared, the target sequence can be used in a variety of reactions for a variety of reasons. For example, in a preferred embodiment, genotyping reactions are done. Similarly, these reactions can also be used to detect the presence or absence of a target sequence. In addition, in any reaction, quantitation of the amount of a target sequence may be done. While the discussion below focuses on genotyping reactions, the discussion applies equally to detecting the presence of target sequences and/or their quantification.

Furthermore, as outlined below for each reaction, each of these techniques may be used in a solution based assay, wherein the reaction is done in solution and a reaction product is bound to the array for subsequent detection, or in solid phase assays, where the reaction occurs on the surface and is detected.

These reactions are generally classified into 5 basic categories, as outlined below.

Simple Hybridization Genotyping

In a preferred embodiment, straight hybridization methods are used to elucidate the identity of the base at the detection position. Generally speaking, these techniques break down into two basic types of reactions: those that rely on competitive hybridization techniques, and those that discriminate using stringency parameters and combinations thereof.

Competitive Hybridization

In a preferred embodiment, competitive hybridization probes are used to elucidate either the identity of the nucleotide(s) at the detection position or the presence of a mismatch. For example, sequencing by hybridization has been described (Drmanac et al., Genomics 4:114 (1989); Koster et al., Nature Biotechnology 14:1123 (1996); U.S. Pat. Nos. 5,525,464; 5,202,231 and 5,695,940, among others, all of which are hereby expressly incorporated by reference in their entirety).

It should be noted in this context that “mismatch” is a relative term and meant to indicate a difference in the identity of a base at a particular position, termed the “detection position” herein, between two sequences. In general, sequences that differ from wild type sequences are referred to as mismatches. However, particularly in the case of SNPs, what constitutes “wild type” may be difficult to determine as multiple alleles can be relatively frequently observed in the population, and thus “mismatch” in this context requires the artificial adoption of one sequence as a standard. Thus, for the purposes of this invention, sequences are referred to herein as “match” and “mismatch”. Thus, the present invention may be used to detect substitutions, insertions or deletions as compared to a wild-type sequence.

In a preferred embodiment, a plurality of probes (sometimes referred to herein as “readout probes”) are used to identify the base at the detection position. In this embodiment, each different readout probe comprises a different detection label (which, as outlined below, can be either a primary label or a secondary label) and a different base at the position that will hybridize to the detection position of the target sequence (herein referred to as the readout position) such that differential hybridization will occur. That is, all other parameters being equal, a perfectly complementary readout probe (a “match probe”) will in general be more stable and have a slower off rate than a probe comprising a mismatch (a “mismatch probe”) at any particular temperature. Accordingly, by using different readout probes, each with a different base at the readout position and each with a different label, the identification of the base at the detection position is elucidated.

Accordingly, a detectable label is incorporated into the readout probe. In a preferred embodiment, a set of readout probes are used, each comprising a different base at the readout position. In some embodiments, each readout probe comprises a different label, that is distinguishable from the others.

For example, a first label may be used for probes comprising adenosine at the readout position, a second label may be used for probes comprising guanine at the readout position, etc. In a preferred embodiment, the length and sequence of each readout probe is identical except for the readout position, although this need not be true in all embodiments.

The number of readout probes used will vary depending on the end use of the assay. For example, many SNPs are biallelic, and thus two readout probes are used, each comprising an interrogation base that will basepair with one of the detection position bases. For sequencing, for example, for the discovery of SNPs, a set of four readout probes are used, although SNPs may also be discovered with fewer readout parameters.
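The readout probe scheme can be illustrated with a minimal Python sketch. The function names, the label names, the example sequence and the `min_ratio` cutoff are hypothetical and are not part of the methods described herein; the sketch simply builds one probe per interrogation base and infers the detection position base from whichever label gives the strongest signal.

```python
# Watson-Crick pairing used to go from readout base to detection-position base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def make_readout_probes(template, readout_index, label_for_base):
    """Build one probe per base; probes differ only at the readout position."""
    return {
        label: template[:readout_index] + base + template[readout_index + 1:]
        for base, label in label_for_base.items()
    }

def call_detection_base(signals, label_for_base, min_ratio=2.0):
    """Return the target base implied by the brightest label, or None if ambiguous.

    min_ratio is an assumed cutoff: the top signal must exceed the runner-up
    by this factor before a single base is called.
    """
    base_for_label = {label: base for base, label in label_for_base.items()}
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    (top_label, top), (_, runner_up) = ranked[0], ranked[1]
    if runner_up > 0 and top / runner_up < min_ratio:
        return None
    return COMPLEMENT[base_for_label[top_label]]

# Biallelic example: two readout probes, one with A and one with G at the readout position.
labels = {"A": "label_1", "G": "label_2"}
probes = make_readout_probes("ACGTNCGTA", 4, labels)
call = call_detection_base({"label_1": 900.0, "label_2": 150.0}, labels)
```

In this sketch a `None` result only means that the two signals are too close to call a single base; for a biallelic SNP such a result could, for example, reflect a heterozygous sample.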

As will be appreciated by those in the art and additionally outlined below, this system can take on a number of different configurations, including a solution phase assay and a solid phase assay.

Solution Phase Assay

A solution phase assay that is followed by attaching the target sequence to an array is depicted in FIG. 8D. In FIG. 8D, a reaction with two different readout probes is shown. After the competitive hybridization has occurred, the target sequence is added to the array, which may take on several configurations, outlined below.

Solid Phase Assay

In a preferred embodiment, the competition reaction is done on the array. This system may take on several configurations.

In a preferred embodiment, a sandwich assay of sorts is used. In this embodiment, the bead comprises a capture probe that will hybridize to a first target domain of a target sequence, and the readout probe will hybridize to a second target domain, as is generally depicted in FIG. 8A. In this embodiment, the first target domain may be either unique to the target, or may be an exogenous adapter sequence added to the target sequence as outlined below, for example through the use of PCR reactions. Similarly, a sandwich assay that utilizes a capture extender probe, as described below, to attach the target sequence to the array is depicted in FIG. 8C.

Alternatively, the capture probe itself can be the readout probe as is shown in FIG. 8B; that is, a plurality of beads are used, each comprising a capture probe that has a different base at the readout position. In general, the target sequence then hybridizes preferentially to the capture probe most closely matched. In this embodiment, either the target sequence itself is labeled (for example, it may be the product of an amplification reaction) or a label probe may bind to the target sequence at a domain remote from the detection position. In this embodiment, since it is the location on the array that serves to identify the base at the detection position, different labels are not required.

In a further embodiment, the target sequence itself is attached to the array, as generally depicted for bead arrays in FIG. 8E and described below.

Stringency Variation

In a preferred embodiment, sensitivity to variations in stringency parameters is used to determine either the identity of the nucleotide(s) at the detection position or the presence of a mismatch. As a preliminary matter, the use of different stringency conditions such as variations in temperature and buffer composition to determine the presence or absence of mismatches in double stranded hybrids comprising a single stranded target sequence and a probe is well known.

With particular regard to temperature, as is known in the art, differences in the number of hydrogen bonds as a function of basepairing between perfect matches and mismatches can be exploited as a result of their different Tms (the temperature at which 50% of the hybrid is denatured). Accordingly, a hybrid comprising perfect complementarity will melt at a higher temperature than one comprising at least one mismatch, all other parameters being equal. (It should be noted that for the purposes of the discussion herein, all other parameters (i.e. length of the hybrid, nature of the backbone (i.e. naturally occurring or nucleic acid analog), the assay solution composition and the composition of the bases, including G-C content) are kept constant. However, as will be appreciated by those in the art, these factors may be varied as well, and then taken into account.)

In general, as outlined herein, high stringency conditions are those that result in perfect matches remaining in hybridization complexes, while imperfect matches melt off. Similarly, low stringency conditions are those that allow the formation of hybridization complexes with both perfect and imperfect matches. High stringency conditions are known in the art; see for example Maniatis et al., Molecular Cloning: A Laboratory Manual, 2d Edition, 1989, and Short Protocols in Molecular Biology, ed. Ausubel, et al., both of which are hereby incorporated by reference. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes, “Overview of principles of hybridization and the strategy of nucleic acid assays” (1993). Generally, stringent conditions are selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g. 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g. greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. In another embodiment, less stringent hybridization conditions are used; for example, moderate or low stringency conditions may be used, as are known in the art; see Maniatis and Ausubel, supra, and Tijssen, supra.

As will be appreciated by those in the art, mismatch detection using temperature may proceed in a variety of ways, and is similar to the use of readout probes as outlined above. Again, as outlined above, a plurality of readout probes may be used in a sandwich format; in this embodiment, all the probes may bind at permissive, low temperatures (temperatures below the Tm of the mismatch); however, when the assay is repeated at a higher temperature (above the Tm of the mismatch), only the perfectly matched probe may bind. Thus, this system may be run with readout probes with different detectable labels, as outlined above. Alternatively, a single probe may be used to query whether a particular base is present.

Alternatively, as described above, the capture probe may serve as the readout probe; in this embodiment, a single label may be used on the target; at temperatures above the Tm of the mismatch, only signals from perfect matches will be seen, as the mismatch target will melt off.

Similarly, variations in buffer composition may be used to elucidate the presence or absence of a mismatch at the detection position. Suitable conditions include, but are not limited to, formamide concentration. Thus, for example, “low” or “permissive” stringency conditions include formamide concentrations of 0 to 10%, while “high” or “stringent” conditions utilize formamide concentrations of ≧40%. Low stringency conditions include NaCl concentrations of ≧1 M, and high stringency conditions include concentrations of ≦0.3 M. Furthermore, low stringency conditions include MgCl₂ concentrations of ≧10 mM, moderate stringency as 1-10 mM, and high stringency conditions include concentrations of ≦1 mM.
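As an illustration only, the formamide and MgCl₂ ranges given above can be encoded as a small Python lookup. The function names are hypothetical, and two choices are assumptions rather than part of the description: formamide concentrations between 10% and 40% are reported as "intermediate", and the shared boundary values (1 mM and 10 mM MgCl₂) are assigned to the high and low stringency classes respectively.

```python
def formamide_stringency(percent):
    """Classify a formamide concentration (percent) using the ranges in the text."""
    if percent < 0:
        raise ValueError("concentration cannot be negative")
    if percent >= 40:
        return "high"
    if percent <= 10:
        return "low"
    # The text does not name the 10-40% range; "intermediate" is an assumption.
    return "intermediate"

def mgcl2_stringency(millimolar):
    """Classify a MgCl2 concentration (mM); more magnesium means lower stringency."""
    if millimolar < 0:
        raise ValueError("concentration cannot be negative")
    if millimolar <= 1:
        return "high"
    if millimolar < 10:
        return "moderate"
    return "low"

# A permissive pass followed by a stringent pass, as in a repeated assay.
permissive = (formamide_stringency(5), mgcl2_stringency(20))
stringent = (formamide_stringency(50), mgcl2_stringency(0.5))
```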

In this embodiment, as for temperature, a plurality of readout probes may be used, with different bases in the readout position (and optionally different labels). Running the assays under the permissive conditions and repeating under stringent conditions will allow the elucidation of the base at the detection position.

In one embodiment, the probes used as readout probes are “Molecular Beacon” probes as are generally described in Whitcombe et al., Nature Biotechnology 17:804 (1999), hereby incorporated by reference. As is known in the art, Molecular Beacon probes form “hairpin” type structures, with a fluorescent label on one end and a quencher on the other. In the absence of the target sequence, the ends of the hairpin hybridize, causing quenching of the label. In the presence of a target sequence, the hairpin structure is lost in favor of target sequence binding, resulting in a loss of quenching and thus an increase in signal.

In one embodiment, the Molecular Beacon probes can be the capture probes as outlined herein for readout probes. For example, different beads comprising labeled Molecular Beacon probes (and different bases at the readout position) are made; optionally, they comprise different labels. Alternatively, since Molecular Beacon probes can have spectrally resolvable signals, all four differently labelled probes (if a set of four different bases is used) are attached to a single bead.

Extension Genotyping

In this embodiment, any number of techniques are used to add a nucleotide to the readout position of a probe hybridized to the target sequence adjacent to the detection position. By relying on enzymatic specificity, preferentially a perfectly complementary base is added. All of these methods rely on the enzymatic incorporation of nucleotides at the detection position. This may be done using chain terminating dNTPs, such that only a single base is incorporated (e.g. single base extension methods), or under conditions that only a single type of nucleotide is added followed by identification of the added nucleotide (extension and pyrosequencing techniques).

Single Base Extension

In a preferred embodiment, single base extension (SBE; sometimes referred to as “minisequencing”) is used to determine the identity of the base at the detection position. SBE is as described above, and utilizes an extension primer that hybridizes to the target nucleic acid immediately adjacent to the detection position. A polymerase (generally a DNA polymerase) is used to extend the 3′ end of the primer with a nucleotide analog labeled with a detection label as described herein. Based on the fidelity of the enzyme, a nucleotide is only incorporated into the readout position of the growing nucleic acid strand if it is perfectly complementary to the base in the target strand at the detection position. The nucleotide may be derivatized such that no further extensions can occur, so only a single nucleotide is added. Once the labeled nucleotide is added, detection of the label proceeds as outlined herein.

The reaction is initiated by introducing the assay complex comprising the target sequence (i.e. the array) to a solution comprising a first nucleotide. In general, the nucleotides comprise a detectable label, which may be either a primary or a secondary label. In addition, the nucleotides may be nucleotide analogs, depending on the configuration of the system. For example, if the dNTPs are added in sequential reactions, such that only a single type of dNTP can be added, the nucleotides need not be chain terminating. In addition, in this embodiment, the dNTPs may all comprise the same type of label.

Alternatively, if the reaction comprises more than one dNTP, the dNTPs should be chain terminating, that is, they have a blocking or protecting group at the 3′ position such that no further dNTPs may be added by the enzyme. As will be appreciated by those in the art, any number of nucleotide analogs may be used, as long as a polymerase enzyme will still incorporate the nucleotide at the readout position. Preferred embodiments utilize dideoxy-triphosphate nucleotides (ddNTPs) and halogenated dNTPs. Generally, a set of nucleotides comprising ddATP, ddCTP, ddGTP and ddTTP is used, each with a different detectable label, although as outlined herein, this may not be required. Alternative preferred embodiments use acyclo nucleotides (NEN). These chain terminating nucleotide analogs are particularly good substrates for Deep Vent (exo⁻) and thermosequenase.
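The logic of SBE with a set of four differently labeled ddNTPs can be sketched in Python as follows. This is an idealized model that assumes perfect polymerase fidelity; the function names, the example target sequence and the label names are hypothetical. The target is written 5′ to 3′, the extension primer is the reverse complement of the target region immediately 3′ of the detection position, and the incorporated terminator is the complement of the detection position base.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def design_extension_primer(target, detection_index, length):
    """Primer whose 3' end pairs with the target base next to the detection position."""
    start = detection_index + 1
    region = target[start:start + length]
    if len(region) < length:
        raise ValueError("target too short for the requested primer length")
    return reverse_complement(region)

def single_base_extension(target, detection_index, label_for_ddntp):
    """Return the base added at the readout position and the label it carries."""
    added = COMPLEMENT[target[detection_index]]
    return added, label_for_ddntp["dd" + added + "TP"]

# Hypothetical labels, one per chain-terminating nucleotide.
LABELS = {"ddATP": "label_A", "ddCTP": "label_C", "ddGTP": "label_G", "ddTTP": "label_T"}

target = "GATTACAGGCTTAACG"   # detection position is index 5 (a C)
primer = design_extension_primer(target, 5, 8)
added_base, observed_label = single_base_extension(target, 5, LABELS)
```

Reading the observed label back through the same table identifies the readout base, and its complement is the base at the detection position.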

In addition, as will be appreciated by those in the art, the single base extension reactions of the present invention allow the precise incorporation of modified bases into a growing nucleic acid strand. Thus, any number of modified nucleotides may be incorporated for any number of reasons, including probing structure-function relationships (e.g. DNA:DNA or DNA:protein interactions), cleaving the nucleic acid, crosslinking the nucleic acid, incorporating mismatches, etc.

As will be appreciated by those in the art, the configuration of the genotyping SBE system can take on several forms.

Solution Phase Assay

As for the OLA reaction described below, the reaction may be done in solution, and then the newly synthesized strands, with the base-specific detectable labels, can be detected. For example, they can be directly hybridized to capture probes that are complementary to the extension primers, and the presence of the label is then detected. This is schematically depicted in FIG. 9C. As will be appreciated by those in the art, a preferred embodiment utilizes four different detectable labels, i.e. one for each base, such that upon hybridization to the capture probe on the array, the identification of the base can be done isothermally. Thus, FIG. 9C depicts the readout position 35 as not necessarily hybridizing to the capture probe.

In a preferred embodiment, adapter sequences can be used in a solution format. In this embodiment, a single label can be used with a set of four separate primer extension reactions. In this embodiment, the extension reaction is done in solution; each reaction comprises a different dNTP with the label or labeled ddNTP when chain termination is desired. For each locus genotyped, a set of four different extension primers are used, each with a portion that will hybridize to the target sequence, a different readout base and each with a different adapter sequence of 15-40 bases, as is more fully outlined below. After the primer extension reaction is complete, the four separate reactions are pooled and hybridized to an array comprising complementary probes to the adapter sequences. A genotype is derived by comparing the probe intensities of the four different hybridized adapter sequences corresponding to a given locus.
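The comparison of the four adapter intensities for a locus can be sketched as follows. The function name and the `het_fraction` cutoff are hypothetical; the description above does not specify how the intensities are compared, so this is only one simple possibility in which every readout base whose adapter signal reaches a chosen fraction of the strongest signal is reported.

```python
def call_genotype(adapter_intensities, het_fraction=0.4):
    """Return the readout bases whose adapter signal is at least
    het_fraction of the strongest signal at this locus.

    adapter_intensities maps each readout base to the intensity measured
    for the adapter sequence assigned to that base. het_fraction is an
    assumed cutoff, not a value taken from the text.
    """
    strongest = max(adapter_intensities.values())
    if strongest <= 0:
        return ()
    return tuple(sorted(
        base for base, value in adapter_intensities.items()
        if value >= het_fraction * strongest
    ))

# One base dominates: a single allele is reported.
homozygous = call_genotype({"A": 1500.0, "C": 40.0, "G": 60.0, "T": 35.0})
# Two bases give comparable signal: both alleles are reported.
heterozygous = call_genotype({"A": 1200.0, "C": 40.0, "G": 1100.0, "T": 35.0})
```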

In addition, since unextended primers do not comprise labels, the unextended primers need not be removed. However, they may be, if desired, as outlined below; for example, if a large excess of primers is used, there may not be sufficient signal from the extended primers competing for binding to the surface.

Alternatively, one of skill in the art could use a single label and temperature to determine the identity of the base; that is, the readout position of the extension primer hybridizes to a position on the capture probe. However, since the three mismatches will have lower Tms than the perfect match, the use of temperature could elucidate the identity of the detection position base.

Solid Phase Assay

Alternatively, the reaction may be done on a surface by capturing the target sequence and then running the SBE reaction, in a sandwich type format schematically depicted in FIG. 9A. In this embodiment, the capture probe hybridizes to a first domain of the target sequence (which can be endogenous or an exogenous adapter sequence added during an amplification reaction), and the extension primer hybridizes to a second target domain immediately adjacent to the detection position. The addition of the enzyme and the required NTPs results in the addition of the interrogation base. In this embodiment, each NTP must have a unique label. Alternatively, each NTP reaction may be done sequentially on a different array. As is known by one of skill in the art, ddNTP and dNTP are the preferred substrates when DNA polymerase is the added enzyme; NTP is the preferred substrate when RNA polymerase is the added enzyme.

Furthermore, as is more fully outlined below and depicted in FIG. 9D, capture extender probes can be used to attach the target sequence to the bead. In this embodiment, the hybridization complex comprises the capture probe, the target sequence and the adapter sequence.

Similarly, the capture probe itself can be used as the extension probe, with its terminus being directly adjacent to the detection position. This is schematically depicted in FIG. 9B. Upon the addition of the target sequence and the SBE reagents, the modified primer is formed comprising a detectable label, and then detected. Again, as for the solution based reaction, each NTP must have a unique label, the reactions must proceed sequentially, or different arrays must be used. Again, as is known by one of skill in the art, ddNTP and dNTP are the preferred substrates when DNA polymerase is the added enzyme; NTP is the preferred substrate when RNA polymerase is the added enzyme.

In addition, as outlined herein, the target sequence may be directly attached to the array; the extension primer hybridizes to it and the reaction proceeds.

Variations on this are shown in FIGS. 9E and 9F, where the capture probe and the extension probe adjacently hybridize to the target sequence. Either before or after extension of the extension probe, a ligation step may be used to attach the capture and extension probes together for stability. These are further described below as combination assays.

In addition, FIG. 9G depicts the SBE solution reaction followed by hybridization of the product of the reaction to the bead array to capture an adapter sequence.

As will be appreciated by those in the art, the determination of the base at the detection position can proceed in several ways. In a preferred embodiment, the reaction is run with all four nucleotides (assuming all four nucleotides are required), each with a different label, as is generally outlined herein. Alternatively, a single label is used, by using four reactions: this may be done either by using a single substrate and sequential reactions, or by using four arrays. For example, dATP can be added to the assay complex, and the generation of a signal evaluated; the dATP can be removed and dTTP added, etc. Alternatively, four arrays can be used; the first is reacted with dATP, the second with dTTP, etc., and the presence or absence of a signal evaluated. Alternatively, the reaction includes chain terminating nucleotides such as ddNTPs or acyclo-NTPs.

Alternatively, ratiometric analysis can be done; for example, two labels, “A” and “B”, on two substrates (e.g. two arrays) can be used. In this embodiment, two sets of primer extension reactions are performed, each on two arrays, with each reaction containing a complete set of four chain terminating NTPs. The first reaction contains two “A” labeled nucleotides and two “B” labeled nucleotides (for example, A and C may be “A” labeled, and G and T may be “B” labeled). The second reaction also contains the two labels, but switched; for example, A and G are “A” labeled and T and C are “B” labeled. This reaction composition allows a biallelic marker to be ratiometrically scored; that is, the intensity of the two labels in two different “color” channels on a single substrate is compared, using data from a set of two hybridized arrays. For instance, if the marker is A/G, then the first reaction on the first array is used to calculate a ratiometric genotyping score; if the marker is A/C, then the second reaction on the second array is used for the calculation; if the marker is G/T, then the second array is used, etc. This concept can be applied to all possible biallelic marker combinations. “Scoring” a genotype using a single fiber ratiometric score allows a much more robust genotyping than scoring a genotype using a comparison of absolute or normalized intensities between two different arrays.
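The selection of the informative reaction for a given biallelic marker can be sketched in Python. The two labeling schemes are taken from the example above; the function names and the particular score, the “A” channel intensity divided by the sum of both channels, are assumptions, since the text does not define the ratiometric score.

```python
# Label carried by each chain-terminating nucleotide in the two reactions.
REACTIONS = {
    1: {"A": "A", "C": "A", "G": "B", "T": "B"},
    2: {"A": "A", "G": "A", "T": "B", "C": "B"},
}

def informative_reaction(allele_1, allele_2):
    """Return the first reaction in which the two alleles carry different labels."""
    for number, scheme in REACTIONS.items():
        if scheme[allele_1] != scheme[allele_2]:
            return number
    raise ValueError("alleles are not distinguished by either reaction")

def ratiometric_score(intensity_a, intensity_b):
    """Assumed score: fraction of total signal found in the 'A' channel."""
    total = intensity_a + intensity_b
    if total <= 0:
        raise ValueError("no signal in either channel")
    return intensity_a / total

# A/G is resolved by the first reaction; A/C and G/T need the second.
reaction_for_ag = informative_reaction("A", "G")
reaction_for_ac = informative_reaction("A", "C")
reaction_for_gt = informative_reaction("G", "T")
score = ratiometric_score(300.0, 100.0)
```

Because both channels are read from the same substrate, a score of this kind does not depend on the absolute intensity of either array, which is consistent with the robustness noted above.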

Removal of Unextended Primers

In a preferred embodiment, for both SBE as well as a number of other reactions outlined herein, it is desirable to remove the unextended or unreacted primers from the assay mixture, and particularly from the array, as unextended primers will compete with the extended (labeled) primers in binding to capture probes, thereby diminishing the signal. The concentration of the unextended primers relative to the extended primer may be relatively high, since a large excess of primer is usually required to generate efficient primer annealing. Accordingly, a number of different techniques may be used to facilitate the removal of unextended primers. As outlined above, these generally include methods based on removal of unreacted primers by binding to a solid support, protecting the reacted primers and degrading the unextended ones, and separating the unreacted and reacted primers.

Protection and Degradation

In this embodiment, the ddTNPs or dNTPs that are added during thereaction confer protection from degradation (whether chemical orenzymatic). Thus, after the assay, the degradation components are added,and unreacted primers are degraded, leaving only the reacted primers.Labeled protecting groups are particularly preferred; for example,3′-substituted-2′-dNTPs can contain anthranylic derivatives that arefluorescent (with alkali or enzymatic treatment for removal of theprotecting group).

In a preferred embodiment, the secondary label is a nuclease inhibitor, such as thiol NTPs. In this embodiment, the chain-terminating NTPs are chosen to render extended primers resistant to nucleases, such as 3′-exonucleases. Addition of an exonuclease will digest the non-extended primers, leaving only the extended primers to bind to the capture probes on the array. This may also be done with OLA, wherein the ligated probe will be protected but the unprotected ligation probe will be digested.

In this embodiment, suitable 3′-exonucleases include, but are not limited to, exo I, exo III, exo VII, and 3′-5′ exophosphodiesterases.

Alternatively, a 3′ exonuclease may be added to a mixture of 3′ labeled biotin/streptavidin; only the unreacted oligonucleotides will be degraded. Following exonuclease treatment, the exonuclease and the streptavidin can be degraded using a protease such as proteinase K. The surviving nucleic acids (i.e. those that were biotinylated) are then hybridized to the array.

Separation Systems

Secondary label systems (and even some primary label systems) can be used to separate unreacted and reacted probes; for example, the addition of streptavidin to a nucleic acid greatly increases its size, as well as changes its physical properties, to allow more efficient separation techniques. For example, the mixtures can be size fractionated by exclusion chromatography, affinity chromatography, filtration or differential precipitation.

Non-Terminated Extension

In a preferred embodiment, methods of adding a single base are used that do not rely on chain termination. That is, similar to SBE, enzymatic reactions that utilize dNTPs and polymerases can be used; however, rather than use chain terminating NTPs, regular dNTPs are used. This method relies on a time-resolved basis of detection; only one type of base is added during the reaction. Thus, for example, four different reactions each containing one of the dNTPs can be done; this is generally accomplished by using four different substrates, although as will be appreciated by those in the art, not all four reactions need occur to identify the nucleotide at a detection position. In this embodiment, the signals from single additions can be compared to those from multiple additions; that is, the addition of a single dATP can be distinguished on the basis of signal intensity from the addition of two or three dATPs. These reactions are accomplished as outlined above for SBE, using extension primers and polymerases; again, one label or four different labels can be used, although as outlined herein, the different NTPs must be added sequentially.

A preferred method of extension in this embodiment is pyrosequencing.

Pyrosequencing

Pyrosequencing is an extension and sequencing method that can be used to add one or more nucleotides to the detection position(s); it is very similar to SBE except that chain terminating NTPs need not be used (although they may be). Pyrosequencing relies on the detection of a reaction product, PPi, produced during the addition of an NTP to a growing oligonucleotide chain, rather than on a label attached to the nucleotide. One molecule of PPi is produced per dNTP added to the extension primer. That is, by running sequential reactions with each of the nucleotides, and monitoring the reaction products, the identity of the added base is determined.

The release of pyrophosphate (PPi) during the DNA polymerase reaction can be quantitatively measured by many different methods and a number of enzymatic methods have been described; see Reeves et al., Anal. Biochem. 28:282 (1969); Guillory et al., Anal. Biochem. 39:170 (1971); Johnson et al., Anal. Biochem. 15:273 (1968); Cook et al., Anal. Biochem. 91:557 (1978); Drake et al., Anal. Biochem. 94:117 (1979); WO 93/23564; WO 98/28440; WO 98/13523; Nyren et al., Anal. Biochem. 151:504 (1985); all of which are incorporated by reference. The latter method allows continuous monitoring of PPi and has been termed ELIDA (Enzymatic Luminometric Inorganic Pyrophosphate Detection Assay). A preferred embodiment utilizes any method which can result in the generation of an optical signal, with preferred embodiments utilizing the generation of a chemiluminescent or fluorescent signal.

A preferred method monitors the creation of PPi by the conversion of PPi to ATP by the enzyme sulfurylase, and the subsequent production of visible light by firefly luciferase (see Ronaghi et al., Science 281:363 (1998), incorporated by reference). In this method, the four deoxynucleotides (dATP, dGTP, dCTP and dTTP; collectively dNTPs) are added stepwise to a partial duplex comprising a sequencing primer hybridized to a single stranded DNA template and incubated with DNA polymerase, ATP sulfurylase, luciferase, and optionally a nucleotide-degrading enzyme such as apyrase. A dNTP is only incorporated into the growing DNA strand if it is complementary to the base in the template strand. The synthesis of DNA is accompanied by the release of PPi equal in molarity to the incorporated dNTP. The PPi is converted to ATP and the light generated by the luciferase is directly proportional to the amount of ATP. In some cases the unincorporated dNTPs and the produced ATP are degraded between each cycle by the nucleotide degrading enzyme.

Accordingly, a preferred embodiment of the methods of the invention is as follows. A substrate comprising microspheres containing the target sequences and extension primers, forming hybridization complexes, is dipped into or contacted with a reaction volume (chamber or well) comprising a single type of dNTP, an extension enzyme, and the reagents and enzymes necessary to detect PPi. If the dNTP is complementary to the base of the target portion of the target sequence adjacent to the extension primer, the dNTP is added, releasing PPi and generating detectable light, which is detected as generally described in U.S. Ser. Nos. 09/151,877 and 09/189,543, and PCT US98/09163, all of which are hereby incorporated by reference. If the dNTP is not complementary, no detectable signal results. The substrate is then contacted with a second reaction volume (chamber) comprising a different dNTP and the additional components of the assay. This process is repeated if the identity of a base at a second detection position is desirable.
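The stepwise logic of this embodiment can be illustrated with a short simulation. The sketch below is idealized and rests on stated assumptions: the input is the series of bases to be added to the extension primer (i.e. the complement of the template), the light signal is taken to be exactly proportional to the number of bases incorporated at each step, and the function name and dispensation order are hypothetical.

```python
def pyrogram(bases_to_add, dispensation_order="ACGT", cycles=2):
    """Simulate sequential dNTP addition; return (dNTP, relative signal) per step.

    bases_to_add: bases the polymerase will add to the primer, in order
    (an assumed input; in practice this is what the assay determines).
    """
    position = 0
    signals = []
    for _ in range(cycles):
        for nucleotide in dispensation_order:
            incorporated = 0
            # Regular (non-terminating) dNTPs extend through runs of the same base,
            # releasing one PPi per incorporated nucleotide.
            while position < len(bases_to_add) and bases_to_add[position] == nucleotide:
                incorporated += 1
                position += 1
            signals.append((nucleotide, incorporated))
    return signals

for nucleotide, signal in pyrogram("AAGT", cycles=1):
    print(nucleotide, signal)
```

In this idealized model, a signal of zero corresponds to a non-complementary dNTP, and a signal of two reflects the addition of two identical bases in one step, consistent with the intensity-based distinction between single and multiple additions described above.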

In a preferred embodiment, washing steps, i.e. the use of washing chambers, may be done in between the dNTP reaction chambers, as required. These washing chambers may optionally comprise a nucleotide-degrading enzyme, to remove any unreacted dNTP and decrease the background signal, as is described in WO 98/28440, incorporated herein by reference.

As will be appreciated by those in the art, the system can be configured in a variety of ways, including either a linear progression or a circular one; for example, four arrays may be used that each can dip into one of four reaction chambers arrayed in a circular pattern. Each cycle of sequencing and reading is followed by a 90 degree rotation, so that each substrate then dips into the next reaction well.
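The circular configuration amounts to a simple rotation schedule, sketched below. The chamber ordering, the zero-based array indices and the function names are assumptions made for illustration.

```python
# Assumed ordering of the four reaction chambers around the circle.
CHAMBERS = ["dATP", "dCTP", "dGTP", "dTTP"]

def chamber_for(array_index, step):
    """Chamber visited by a given array after `step` rotations of 90 degrees."""
    return CHAMBERS[(array_index + step) % len(CHAMBERS)]

def schedule(steps):
    """One row per step; each row lists the chamber for arrays 0 through 3."""
    return [[chamber_for(a, s) for a in range(len(CHAMBERS))] for s in range(steps)]

for step, row in enumerate(schedule(4)):
    print(step, row)
```

After four steps every array has visited every chamber exactly once, and at each step all four chambers are in use.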

In a preferred embodiment, one or more internal control sequences are used. That is, at least one microsphere in the array comprises a known sequence that can be used to verify that the reactions are proceeding correctly. In a preferred embodiment, at least four control sequences are used, each of which has a different nucleotide at each position: the first control sequence will have an adenosine at position 1, the second will have a cytosine, the third a guanosine, and the fourth a thymidine, thus ensuring that at least one control sequence is “lighting up” at each step to serve as an internal control.
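One way to construct such a set of four control sequences is by cyclic shifts of the four bases, as sketched below. The cyclic-shift construction and the function names are assumptions chosen for illustration; any set in which the four controls differ at every position would serve.

```python
BASES = "ACGT"

def control_sequences(length):
    """Four control sequences; control k has base BASES[(k + p) % 4] at position p."""
    return ["".join(BASES[(k + p) % 4] for p in range(length)) for k in range(4)]

def covers_all_bases(controls):
    """True if, at every position, the controls together show all four bases."""
    return all(set(column) == set(BASES) for column in zip(*controls))

controls = control_sequences(8)
for sequence in controls:
    print(sequence)
print(covers_all_bases(controls))
```

With this construction the first control begins with A, the second with C, the third with G and the fourth with T, matching the arrangement described above.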

As for simple extension and SBE, the pyrosequencing systems may be configured in a variety of ways; for example, the target sequence may be attached to the bead in a variety of ways, including direct attachment of the target sequence; the use of a capture probe with a separate extension probe; the use of a capture extender probe, a capture probe and a separate extension probe; the use of adapter sequences in the target sequence with capture and extension probes; and the use of a capture probe that also serves as the extension probe.

One additional benefit of pyrosequencing for genotyping purposes is that since the reaction does not rely on the incorporation of labels into a growing chain, the unreacted extension primers need not be removed.

Allelic PCR

In a preferred embodiment, the method used to detect the base at the detection position is allelic PCR, referred to herein as “aPCR”. As described in Newton et al., Nucl. Acids Res. 17:2503 (1989), hereby expressly incorporated by reference, allelic PCR allows single base discrimination based on the fact that the PCR reaction does not proceed well if the terminal 3′-nucleotide is mismatched, assuming the DNA polymerase being used lacks a 3′-exonuclease proofreading activity. Accordingly, the identification of the base proceeds by using allelic PCR primers (sometimes referred to herein as aPCR primers) that have readout positions at their 3′ ends. Thus the target sequence comprises a first domain comprising at its 5′ end a detection position.

In general, aPCR may be briefly described as follows. A double stranded target nucleic acid is denatured, generally by raising the temperature, and then cooled in the presence of an excess of an aPCR primer, which then hybridizes to the first target strand. If the readout position of the aPCR primer basepairs correctly with the detection position of the target sequence, a DNA polymerase (again, that lacks 3′-exonuclease activity) then acts to extend the primer with dNTPs, resulting in the synthesis of a new strand forming a hybridization complex. The sample is then heated again, to disassociate the hybridization complex, and the process is repeated. By using a second PCR primer for the complementary target strand, rapid and exponential amplification occurs. Thus aPCR steps are denaturation, annealing and extension. The particulars of aPCR are well known, and include the use of a thermostable polymerase such as Taq I polymerase and thermal cycling.

Accordingly, the aPCR reaction requires at least one aPCR primer, a polymerase, and a set of dNTPs. As outlined herein, the primers may comprise the label, or one or more of the dNTPs may comprise a label.

Furthermore, the aPCR reaction may be run as a competition assay of sorts. For example, for biallelic SNPs, a first aPCR primer comprising a first base at the readout position and a first label, and a second aPCR primer comprising a different base at the readout position and a second label, may be used. The PCR primer for the other strand is the same. The examination of the ratio of the two colors can serve to identify the base at the detection position.

In general, as is more fully outlined below, the capture probes on the beads of the array are designed to be substantially complementary to the extended part of the primer; that is, unextended primers will not bind to the capture probes.

Ligation Techniques for Genotyping

In this embodiment, the readout of the base at the detection position proceeds using a ligase. In this embodiment, it is the specificity of the ligase which is the basis of the genotyping; that is, ligases generally require that the 5′ and 3′ ends of the ligation probes have perfect complementarity to the target for ligation to occur. Thus, in a preferred embodiment, the identity of the base at the detection position proceeds utilizing OLA as described above, as is generally depicted in FIG. 10. The method can be run in at least two different ways; in a first embodiment, only one strand of a target sequence is used as a template for ligation; alternatively, both strands may be used; the latter is generally referred to as Ligation Chain Reaction or LCR.

This method is based on the fact that two probes can be preferentially ligated together, if they are hybridized to a target strand and if perfect complementarity exists at the two bases being ligated together. Thus, in this embodiment, the target sequence comprises a contiguous first target domain comprising the detection position and a second target domain adjacent to the detection position. That is, the detection position is “between” the rest of the first target domain and the second target domain. A first ligation probe is hybridized to the first target domain and a second ligation probe is hybridized to the second target domain. If the first ligation probe has a base perfectly complementary to the detection position base, and the adjacent base on the second probe has perfect complementarity to its position, a ligation structure is formed such that the two probes can be ligated together to form a ligated probe. If this complementarity does not exist, no ligation structure is formed and the probes are not ligated together to an appreciable degree. This may be done using heat cycling, to allow the ligated probe to be denatured off the target sequence such that it may serve as a template for further reactions. In addition, as is more fully outlined below, this method may also be done using ligation probes that are separated by one or more nucleotides, if dNTPs and a polymerase are added (this is sometimes referred to as “Genetic Bit” analysis).
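The junction requirement can be expressed as a small check, sketched below. This is a deliberately simplified model: all sequences are written 5′ to 3′, the two probes are assumed to hybridize side by side across the whole target window, and only the two bases at the junction are tested. The function names and example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]

def can_ligate(target, first_probe, second_probe):
    """True if the 3' end of first_probe and the 5' end of second_probe
    are both perfectly complementary to the target at the junction."""
    expected = reverse_complement(target)
    if len(first_probe) + len(second_probe) != len(expected):
        return False  # the probes would not abut on this target window
    split = len(first_probe)
    return first_probe[-1] == expected[split - 1] and second_probe[0] == expected[split]

target = "GATTACAGGC"
print(can_ligate(target, "GCCTG", "TAATC"))  # matched junction
print(can_ligate(target, "GCCTA", "TAATC"))  # mismatch at the 3' end of the first probe
```

A real ligase also depends on hybridization stability and reaction conditions, which this check does not attempt to model.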

In a preferred embodiment, LCR is done for two strands of a double-stranded target sequence. The target sequence is denatured, and two sets of probes are added: one set as outlined above for one strand of the target, and a separate set (i.e. third and fourth primer probe nucleic acids) for the other strand of the target. In a preferred embodiment, the first and third probes will hybridize, and the second and fourth probes will hybridize, such that amplification can occur. That is, when the first and second probes have been attached, the ligated probe can now be used as a template, in addition to the second target sequence, for the attachment of the third and fourth probes. Similarly, the ligated third and fourth probes will serve as a template for the attachment of the first and second probes, in addition to the first target strand. In this way, an exponential, rather than just a linear, amplification can occur.
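The difference between linear and exponential amplification can be made concrete with an idealized calculation. The sketch below assumes a constant per-cycle ligation efficiency and no reagent depletion; the function name and parameters are hypothetical and are used only to illustrate the growth behavior.

```python
def ligated_products(cycles, templates=1.0, efficiency=1.0, both_strands=False):
    """Idealized yield of ligated probe after a number of thermal cycles.

    One strand only: each cycle adds `efficiency` products per original template.
    Both strands (LCR): every ligated probe becomes a template in later cycles,
    so the template pool grows by a factor of (1 + efficiency) per cycle.
    """
    if not both_strands:
        return templates * efficiency * cycles
    return templates * ((1.0 + efficiency) ** cycles - 1.0)

print(ligated_products(10))                      # linear growth
print(ligated_products(10, both_strands=True))   # exponential growth
```

Under these assumptions, ten cycles give ten products per starting template when one strand is used and about a thousand when both strands are used.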

As will be appreciated by those in the art, the ligation product can be detected in a variety of ways. Preferably, detection is accomplished by removing the unligated labeled probe from the reaction before application to a capture probe. In one embodiment, the unligated probes are removed by digesting 3′ non-protected oligonucleotides with a 3′ exonuclease, such as exonuclease I. The ligation products are protected from exo I digestion by including, for example, a number of sequential phosphorothioate residues at their 3′ terminus (for example at least four), thereby rendering them resistant to exonuclease digestion. The unligated detection oligonucleotides are not protected and are digested.

As for most or all of the methods described herein, the assay can take on a solution-based form or a solid-phase form.

Solution Based OLA

In a preferred embodiment, as shown in FIG. 10A, the ligation reaction is run in solution. In this embodiment, only one of the primers carries a detectable label, e.g. the first ligation probe, and the capture probe on the bead is substantially complementary to the other probe, e.g. the second ligation probe. In this way, unligated labeled ligation primers will not interfere with the assay. This substantially reduces or eliminates false signal generated by the optically-labeled 3′ primers.

In addition, a solution-based OLA assay that utilizes adapter sequences may be done. In this embodiment, rather than have the target sequence comprise the adapter sequences, one of the ligation probes comprises the adapter sequence. This facilitates the creation of “universal arrays”. For example, as depicted in FIG. 10E, the first ligation probe has an adapter sequence that is used to attach the ligated probe to the array.

Again, as outlined above for SBE, unreacted ligation primers may be removed from the mixture as needed. For example, the first ligation probe may comprise the label (either a primary or secondary label) and the second may be blocked at its 3′ end with an exonuclease blocking moiety; after ligation and the introduction of the nuclease, the labeled ligation probe will be digested, leaving the ligation product and the second probe; however, since the second probe is unlabeled, it is effectively silent in the assay. Similarly, the second probe may comprise a binding partner used to pull out the ligated probes, leaving unligated labeled ligation probes behind. The binding pair is then disassociated and the ligated probes are added to the array.

Solid Phase Based OLA

Alternatively, the target nucleic acid is immobilized on a solid-phase surface. The OLA assay is performed and unligated oligonucleotides, and thus the label, are removed by washing under appropriate stringency. For example, as depicted in FIG. 10B, the capture probe can comprise one of the ligation probes. Similarly, FIGS. 10C and 10D depict alternative attachments.

Again, as outlined above, the detection of the OLA reaction can also occur directly, in the case where one or both of the primers comprises at least one detectable label, or indirectly, using sandwich assays, through the use of additional probes; that is, the ligated probes can serve as target sequences, and detection may utilize amplification probes, capture probes, capture extender probes, label probes, and label extender probes, etc.

Solid Phase Oligonucleotide Ligation Assay (SPOLA)

In a preferred embodiment, a novel method of OLA is used, termed herein “solid phase oligonucleotide ligation assay”, or “SPOLA”. In this embodiment, the ligation probes are both attached to the same site on the surface of the array (e.g. when microsphere arrays are used, to the same bead), one at its 5′ end (the “upstream probe”) and one at its 3′ end (the “downstream probe”), as is generally depicted in FIG. 11. This may be done as will be appreciated by those in the art. At least one of the probes is attached via a cleavable linker that, upon cleavage, forms a reactive or detectable (fluorophore) moiety. If ligation occurs, the reactive moiety remains associated with the surface; but if no ligation occurs, due to a mismatch, the reactive moiety is free in solution to diffuse away from the surface of the array. The reactive moiety is then used to add a detectable label.

Generally, as will be appreciated by those in the art, cleavage of the cleavable linker should result in asymmetrical products; i.e. one of the “ends” should be reactive, and the other should not, with the configuration of the system such that the reactive moiety remains associated with the surface if ligation occurred. Thus, for example, amino acids or succinate esters can be cleaved either enzymatically (via peptidases (aminopeptidase and carboxypeptidase) or proteases) or chemically (acid/base hydrolysis) to produce an amine and a carboxyl group. One of these groups can then be used to add a detectable label, as will be appreciated by those in the art and discussed herein.

Padlock Probe Ligation

In a preferred embodiment, the ligation probes are specialized probes called “padlock probes”; see Nilsson et al., Science 265:2085 (1994), hereby incorporated by reference. These probes have a first ligation domain that is identical to a first ligation probe, in that it hybridizes to a first target sequence domain, and a second ligation domain, identical to the second ligation probe, that hybridizes to an adjacent target sequence domain. Again, as for OLA, the detection position can be either at the 3′ end of the first ligation domain or at the 5′ end of the second ligation domain. However, the two ligation domains are connected by a linker, frequently nucleic acid. The configuration of the system is such that upon ligation of the first and second ligation domains of the padlock probe, the probe forms a circular probe, and forms a complex with the target sequence wherein the target sequence is “inserted” into the loop of the circle.

In this embodiment, the unligated probes may be removed through degradation (for example, through a nuclease), as there are no “free ends” in the ligated probe.

Cleavage Techniques for Genotyping

In a preferred embodiment, the specificity for genotyping is provided by a cleavage enzyme. There are a variety of enzymes known to cleave at specific sites, either based on sequence specificity, such as restriction endonucleases, or using structural specificity, such as is done through the use of invasive cleavage technology.

Endonuclease Techniques

In a preferred embodiment, enzymes that rely on sequence specificity are used. In general, these systems rely on the cleavage of double stranded sequence containing a specific sequence recognized by a nuclease, preferably an endonuclease including resolvases.

These systems may work in a variety of ways, as is generally depicted in FIG. 12. In one embodiment (FIG. 12A), a labeled readout probe (generally attached to a bead of the array) is used; the binding of the target sequence forms a double stranded sequence that a restriction endonuclease can then recognize and cleave, if the correct sequence is present. An enzyme resulting in “sticky ends” is shown in FIG. 12A. The cleavage results in the loss of the label, and thus a loss of signal.
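The loss-of-signal readout can be sketched as a simple sequence check. The example assumes, for illustration only, that the enzyme is EcoRI (recognition sequence GAATTC, which leaves sticky ends), that the probe and target form a perfect duplex, and that any duplex containing the site is cut to completion; the function name and sequences are hypothetical.

```python
ECORI_SITE = "GAATTC"  # palindromic, so checking one strand of the duplex suffices

def label_retained(duplex_strand, site=ECORI_SITE):
    """True if the duplex lacks the recognition site, so the label stays on the bead."""
    return site not in duplex_strand

allele_1 = "ACGAATTCGT"  # contains the site: cleaved, signal lost
allele_2 = "ACGAGTTCGT"  # single base change destroys the site: signal retained
print(label_retained(allele_1))
print(label_retained(allele_2))
```

In this model a bead that stays bright indicates the allele that lacks the recognition sequence.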

Alternatively, as will be appreciated by those in the art, a labelled target sequence may be used as well; for example, a labelled primer may be used in the PCR amplification of the target, such that the label is incorporated in such a manner as to be cleaved off by the enzyme.

Alternatively, the readout probe (or, again, the target sequence) may comprise both a fluorescent label and a quencher, as is known in the art and depicted in FIG. 12B. In this embodiment, the label and the quencher are attached to different nucleosides, yet are close enough that the quencher molecule results in little or no signal being present. Upon the introduction of the enzyme, the quencher is cleaved off, leaving the label, and allowing signalling by the label.

In addition, as will be appreciated by those in the art, these systems can be either solution-based assays or solid-phase assays, as outlined herein.

Furthermore, there are some systems that do not require cleavage for detection; for example, some nucleic acid binding proteins will bind to specific sequences and can thus serve as a secondary label. For example, some transcription factors will bind in a highly sequence dependent manner, and can distinguish between two SNPs. Having bound to the hybridization complex, a detectable binding partner can be added for detection. In addition, mismatch binding proteins based on mutated transcription factors can be used.

In addition, as will be appreciated by those in the art, this type of approach works with other cleavage methods as well, for example the use of invasive cleavage methods, as outlined below.

Invasive Cleavage

In a preferred embodiment, the determination of the identity of the base at the detection position of the target sequence proceeds using invasive cleavage technology. As outlined above for amplification, invasive cleavage techniques rely on the use of structure-specific nucleases, where the structure can be formed as a result of the presence or absence of a mismatch. Generally, invasive cleavage technology may be described as follows. A target nucleic acid is recognized by two distinct probes. A first probe, generally referred to herein as an “invader” probe, is substantially complementary to a first portion of the target nucleic acid. A second probe, generally referred to herein as a “signal probe”, is partially complementary to the target nucleic acid: the 3′ end of the signal oligonucleotide is substantially complementary to the target sequence while the 5′ end is non-complementary and preferably forms a single-stranded “tail” or “arm”. The non-complementary end of the second probe preferably comprises a “generic” or “unique” sequence, frequently referred to herein as a “detection sequence”, that is used to indicate the presence or absence of the target nucleic acid, as described below. The detection sequence of the second probe preferably comprises at least one detectable label. Alternative methods have the detection sequence functioning as a target sequence for a capture probe, and thus rely on sandwich configurations using label probes.

Hybridization of the first and second oligonucleotides near or adjacent to one another on the target nucleic acid forms a number of structures. In a preferred embodiment, a forked cleavage structure, as shown in FIG. 13, forms and is a substrate of a nuclease which cleaves the detection sequence from the signal oligonucleotide. The site of cleavage is controlled by the distance or overlap between the 3′ end of the invader oligonucleotide and the downstream fork of the signal oligonucleotide. Therefore, neither oligonucleotide is subject to cleavage when misaligned or when unattached to target nucleic acid.

As above, the invasive cleavage assay is preferably performed on an array format. In a preferred embodiment, the signal probe has a detectable label, attached 5′ from the site of nuclease cleavage (e.g. within the detection sequence) and a capture tag, as described herein for removal of the unreacted products (e.g. biotin or other hapten) 3′ from the site of nuclease cleavage. After the assay is carried out, the uncleaved probe and the 3′ portion of the cleaved signal probe (e.g. the detection sequence) may be extracted, for example, by binding to streptavidin beads or by crosslinking through the capture tag to produce aggregates or by antibody to an attached hapten. By “capture tag” herein is meant one of a pair of binding partners as described above, such as antigen/antibody pairs, digoxigenin, dinitrophenol, etc.

The cleaved 5′ region, e.g. the detection sequence, of the signal probe, comprises a label and is detected and optionally quantitated. In one embodiment, the cleaved 5′ region is hybridized to a probe on an array (capture probe) and optically detected (FIG. 13). As described below, many different signal probes can be analyzed in parallel by hybridization to their complementary probes in an array. In a preferred embodiment as depicted in FIG. 13, combination techniques are used: to obtain higher specificity and reduce the detection of contaminating uncleaved signal probe or incorrectly cleaved product, an enzymatic recognition step is introduced in the array capture procedure. For example, as more fully outlined below, the cleaved signal probe binds to a capture probe to produce a double-stranded nucleic acid in the array. In this embodiment, the 3′ end of the cleaved signal probe is adjacent to the 5′ end of one strand of the capture probe, thereby forming a substrate for DNA ligase (Broude et al. 1991. PNAS 91: 3072-3076). Only correctly cleaved product is ligated to the capture probe. Other incorrectly hybridized and non-cleaved signal probes are removed, for example, by heat denaturation, high stringency washes, and other methods that disrupt base pairing.

Accordingly, the present invention provides methods of determining the identity of a base at the detection position of a target sequence. In this embodiment, the target sequence comprises, 5′ to 3′, a first target domain comprising an overlap domain comprising at least a nucleotide in the detection position, and a second target domain contiguous with the detection position. A first probe (the “invader probe”) is hybridized to the first target domain of the target sequence. A second probe (the “signal probe”), comprising a first portion that hybridizes to the second target domain of the target sequence and a second portion that does not hybridize to the target sequence, is hybridized to the second target domain. If the second probe comprises a base that is perfectly complementary to the detection position, a cleavage structure is formed. The addition of a cleavage enzyme, such as is described in U.S. Pat. Nos. 5,846,717; 5,614,402; 5,719,029; 5,541,311 and 5,843,669, all of which are expressly incorporated by reference, results in the cleavage of the detection sequence from the signaling probe. This then can be used as a target sequence in an assay complex.

In addition, as for a variety of the techniques outlined herein, unreacted probes (i.e. signaling probes, in the case of invasive cleavage) may be removed using any number of techniques. For example, the use of a binding partner (70 in FIG. 13C) coupled to a solid support comprising the other member of the binding pair can be done. Similarly, after cleavage of the primary signal probe, the newly created cleavage products can be selectively labeled at the 3′ or 5′ ends using enzymatic or chemical methods.

Again, as outlined above, the detection of the invasive cleavage reaction can occur directly, in the case where the detection sequence comprises at least one label, or indirectly, using sandwich assays, through the use of additional probes; that is, the detection sequences can serve as target sequences, and detection may utilize amplification probes, capture probes, capture extender probes, label probes, and label extender probes, etc.

In addition, as for most of the techniques outlined herein, these techniques may be done for the two strands of a double-stranded target sequence. The target sequence is denatured, and two sets of probes are added: one set as outlined above for one strand of the target, and a separate set for the other strand of the target.

Thus, the invasive cleavage reaction requires, in no particular order, an invader probe, a signaling probe, and a cleavage enzyme.

As for other methods outlined herein, the invasive cleavage reaction may be done as a solution based assay or a solid phase assay.

Solution-Based Invasive Cleavage

The invasive cleavage reaction may be done in solution, followed by addition of one of the components to an array, with optional (but preferable) removal of unreacted probes. For example, as depicted in FIG. 13C, the reaction is carried out in solution, using a capture tag (i.e. a member of a binding partner pair) that is separated from the label on the detection sequence with the cleavage site. After cleavage (dependent on the base at the detection position), the signaling probe is cleaved. The capture tag is used to remove the uncleaved probes (for example, using magnetic particles comprising the other member of the binding pair), and the remaining solution is added to the array. FIG. 13C depicts the direct attachment of the detection sequence to the capture probe. In this embodiment, the detection sequence can effectively act as an adapter sequence. In alternate embodiments, as depicted in FIG. 13D, the detection sequence is unlabelled and an additional label probe is used; as outlined below, this can be ligated to the hybridization complex.

Solid-Phase Based Assays

The invasive cleavage reaction can also be done as a solid-phase assay. As depicted in FIG. 13A, the target sequence can be attached to the array using a capture probe (in addition, although not shown, the target sequence may be directly attached to the array). In a preferred embodiment, the signaling probe comprises both a fluorophore label (attached to the portion of the signaling probe that hybridizes to the target) and a quencher (generally on the detection sequence), with a cleavage site in between. Thus, in the absence of cleavage, very little signal is seen due to the quenching reaction. After cleavage, however, the detection sequence is removed, along with the quencher, leaving the unquenched fluorophore. Similarly, the invasive probe may be attached to the array, as depicted in FIG. 13B.

In a preferred embodiment, the invasive cleavage reaction is configured to utilize a fluorophore-quencher reaction. A signaling probe comprising both a fluorophore and a quencher is attached to the bead. The fluorophore is contained on the portion of the signaling probe that hybridizes to the target sequence, and the quencher is contained on a portion of the signaling probe that is on the other side of the cleavage site (termed the “detection sequence” herein). In a preferred embodiment, it is the 3′ end of the signaling probe that is attached to the bead (although as will be appreciated by those in the art, the system can be configured in a variety of different ways, including methods that would result in a loss of signal upon cleavage). Thus, the quencher molecule is located 5′ to the cleavage site. Upon assembly of an assay complex, comprising the target sequence, an invader probe, and a signaling probe, and the introduction of the cleavage enzyme, the cleavage of the complex results in the disassociation of the quencher from the complex, resulting in an increase in fluorescence.

In this embodiment, suitable fluorophore-quencher pairs are as known in the art. For example, suitable quencher molecules comprise Dabcyl.

Combination Techniques

It is also possible to combine two or more of these techniques to do genotyping, quantification, detection of sequences, etc.

Novel Combination of Competitive Hybridization and Extension

In a preferred embodiment, a combination of competitive hybridization and extension, particularly SBE, is used. This may be generally described as follows. In this embodiment, different extension primers comprising different bases at the readout position are used. These are hybridized to a target sequence under stringency conditions that favor perfect matches, and then an extension reaction is done. Basically, the readout probe that has the match at the readout position will be preferentially extended for two reasons: first, the readout probe will hybridize more efficiently to the target (e.g. has a slower off rate), and second, the extension enzyme will preferentially add a nucleotide to a “hybridized” base. The reactions can then be detected in a number of ways, as outlined herein.

The system can take on a number of configurations, depending on the number of labels used, the use of adapters, whether a solution-based or surface-based assay is done, etc. Several preferred embodiments are shown in FIG. 14.

In a preferred embodiment, at least two different readout probes are used, each with a different base at the readout position and each with a unique detectable label that allows the identification of the base at the readout position. As described herein, these detectable labels may be either primary or secondary labels, with primary labels being preferred. As for all the competitive hybridization reactions, a competition for hybridization exists with the reaction conditions being set to favor match over mismatch. When the correct match occurs, the 3′ end of the hybridization complex is now double stranded and thus serves as a template for an extension enzyme to add at least one base to the probe, at a position adjacent to the readout position. As will be appreciated by those in the art, for most SNP analysis, the nucleotide next to the detection position will be the same in all the reactions.

In one embodiment, chain terminating nucleotides may be used; alternatively, non-terminating nucleotides may be used and multiple nucleotides may be added, if desired. The latter may be particularly preferred as an amplification step of sorts; if the nucleotides are labelled, the addition of multiple labels can result in signal amplification.

In a preferred embodiment, the nucleotides are analogs that allow separation of reacted and unreacted primers as described herein; for example, this may be done by using a nuclease blocking moiety to protect extended primers and allow preferential degradation of unextended primers, or biotin (or iminobiotin) to preferentially remove the extended primers (this is done in a solution based assay, followed by elution and addition to the array).

As for the other reactions outlined herein, this may be done as a solution based assay, or a solid phase assay. Solution based assays are generally depicted in FIGS. 14A, 14B and 14C. In a solid phase reaction, an example of which is depicted in FIG. 14D, the capture probe serves as the readout probe; in this embodiment, different positions on the array (e.g. different beads) comprise different readout probes. That is, at least two different capture/readout probes are used, with three and four also possible, depending on the allele. The reaction is run under conditions that favor the formation of perfect match hybridization complexes. In this embodiment, the dNTPs comprise a detectable label, preferably a primary label such as a fluorophore. Since the competitive readout probes are spatially defined in the array, one fluorescent label can distinguish between the alleles; furthermore, it is the same nucleotide that is being added in the reaction, since it is the position adjacent to the SNP that is being extended. As for all the competitive assays, relative fluorescence intensity distinguishes between the alleles and between homozygosity and heterozygosity. In addition, multiple extension reactions can be done to amplify the signal.
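The use of relative fluorescence intensity to distinguish alleles and zygosity can be illustrated with a minimal sketch. The function name, the background subtraction, the minimum-signal cutoff and the heterozygote window of 0.3 to 0.7 are all illustrative assumptions and are not values given in this disclosure; in practice such thresholds would be set empirically.

```python
def call_genotype(intensity_a, intensity_b, background=0.0,
                  het_window=(0.3, 0.7), min_signal=1.0):
    """Call a genotype from the signals at two array positions.

    intensity_a / intensity_b: fluorescence at the positions carrying the
    readout probes for allele A and allele B (hypothetical inputs).
    All thresholds are assumptions chosen for illustration only.
    """
    # Subtract background and clamp at zero.
    a = max(intensity_a - background, 0.0)
    b = max(intensity_b - background, 0.0)
    total = a + b
    if total < min_signal:
        return "no call"  # too little signal to compare the two positions
    frac_a = a / total    # relative intensity attributable to allele A
    low, high = het_window
    if frac_a > high:
        return "homozygous A"
    if frac_a < low:
        return "homozygous B"
    return "heterozygous"


if __name__ == "__main__":
    print(call_genotype(950.0, 50.0))
    print(call_genotype(500.0, 480.0))
    print(call_genotype(40.0, 900.0))
```

The call depends only on the ratio of the two signals, which is why a single fluorescent label suffices when the readout probes are spatially resolved.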

For both solution and solid phase reactions, adapters may be additionally used. In a preferred embodiment, as shown in FIG. 14B for the solution based assay (although as will be appreciated by those in the art, a solid phase reaction may be done as well), adapters on the 5′ ends of the readout probes are used, with identical adapters used for each allele. Each readout probe has a unique detectable label that allows the determination of the base at the readout position. After hybridization and extension, the readout probes are added to the array; the adapter sequences direct the probes to particular array locations and the relative intensities of the two labels distinguish between alleles.

Alternatively, as depicted in FIG. 14C for the solution based assay (although as will be appreciated by those in the art, a solid phase reaction may be done as well), a different adapter may be used for each readout probe. In this embodiment, a single label may be used, since spatial resolution is used to distinguish the alleles by having a unique adapter attached to each allelic probe. After hybridization and extension, the readout probes are added to the array; the unique adapter sequences direct the probes to unique array locations. In this embodiment, it is the relative intensities of two array positions that distinguish between alleles.

As will be appreciated by those in the art, any array may be used in this novel method, including both ordered and random arrays. In a preferred embodiment, the arrays may be made through spotting techniques, photolithographic techniques, printing techniques, or preferably are bead arrays.

Combination of Competitive Hybridization and Invasive Cleavage

In a preferred embodiment, a combination of competitive hybridization and invasive cleavage is done. As will be appreciated by those in the art, this technique is invasive cleavage as described above, with at least two sets of probes comprising different bases in the readout position. By running the reactions under conditions that favor hybridization complexes with perfect matches, different alleles may be distinguished.

In a preferred embodiment, this technique is done on bead arrays.

Novel Combination of Invasive Cleavage and Ligation

In a preferred embodiment, invasive cleavage and ligation are done, as is generally depicted in FIG. 15. In this embodiment, the specificity of the invasive cleavage reaction is used to detect the nucleotide in the detection position, and the specificity of the ligase reaction is used to ensure that only cleaved probes give a signal; that is, the ligation reaction confers an extra level of specificity.

The detection sequence, comprising a detectable label, of the signal probe is cleaved if the correct basepairing is present, as outlined above. The detection sequence then serves as the “target sequence” in a secondary reaction for detection; it is added to a capture probe on a microsphere. The capture probe in this case comprises a first double stranded portion and a second single stranded portion that will hybridize to the detection sequence. Again, preferred embodiments utilize adjacent portions, although dNTPs and a polymerase to fill in the “gap” may also be used. A ligase is then added. As shown in FIG. 15A, only if the signal probe has been cleaved will ligation occur; this results in covalent attachment of the signal probe to the array. This may be detected as outlined herein; preferred embodiments utilize stringency conditions that will discriminate between the ligated and unligated systems.

As will be appreciated by those in the art, this system may take on a number of configurations, including solution based and solid based assays. In a preferred embodiment, as outlined above, the system is configured such that only if cleavage occurs will ligation happen. In a preferred embodiment, this may be done using blocking moieties; the technique can generally be described as follows. An invasive cleavage reaction is done, using a signaling probe that is blocked at the 3′ end. Following cleavage, which creates a free 3′ terminus, a ligation reaction is done, generally using a template target and a second ligation probe comprising a detectable label. Since the signaling probe has a blocked 3′ end, only those probes undergoing cleavage get ligated and labelled.

Alternatively, the orientations may be switched; in this embodiment, a free 5′ phosphate is generated and is available for labeling.

Accordingly, in this embodiment, a solution invasive cleavage reaction is done (although as will be appreciated by those in the art, a support bound invasive cleavage reaction may be done as well).

As will be appreciated by those in the art, any array may be used in this novel method, including both ordered (predefined) and random arrays. In a preferred embodiment, the arrays may be made through spotting techniques, photolithographic techniques, printing techniques, or preferably are bead arrays.

Combination of Invasive Cleavage and Extension

In a preferred embodiment, a combination of invasive cleavage and extension reactions is done, as generally depicted in FIG. 16A. The technique can generally be described as follows. An invasive cleavage reaction is done, using a signaling probe that is blocked at the 3′ end. Following cleavage, which creates a free 3′ terminus, an extension reaction is done (either enzymatically or chemically) to add a detectable label. Since the signaling probe has a blocked 3′ end, only those probes undergoing cleavage get labelled.

Alternatively, the orientations may be switched, for example when chemical extension or labeling is done. In this embodiment, a free 5′ phosphate is generated and is available for labeling.

In a preferred embodiment, the invasive cleavage reaction is configured as shown in FIG. 16B. In this embodiment, the signaling probe is attached to the array at the 5′ end (e.g. to the detection sequence) and comprises a blocking moiety at the 3′ end. The blocking moiety serves to prevent any alteration (including either enzymatic alteration or chemical alteration) of the 3′ end. Suitable blocking moieties include, but are not limited to, chain terminators, alkyl groups, halogens; basically any non-hydroxy moiety.

Upon formation of the assay complex comprising the target sequence, the invader probe, and the signaling probe, and the introduction of the cleavage enzyme, the portion of the signaling probe comprising the blocking moiety is removed. As a result, a free 3′ OH group is generated. This can be extended either enzymatically or chemically, to incorporate a detectable label. For example, enzymatic extension may occur. In a preferred embodiment, a non-templated extension occurs, for example, through the use of terminal transferase. Thus, for example, a modified dNTP may be incorporated, wherein the modification comprises the presence of a primary label such as a fluor, or a secondary label such as biotin, followed by the addition of a labeled streptavidin, for example. Similarly, the addition of a template (e.g. a secondary target sequence that will hybridize to the detection sequence attached to the bead) allows the use of any number of reactions as outlined herein, such as simple extension, SBE, pyrosequencing, OLA, etc. Again, this generally (but not always) utilizes the incorporation of a label into the growing strand.

Alternatively, as will be appreciated by those in the art, chemical labelling or extension methods may be used to label the 3′ OH group.

As for all the combination methods, there are several advantages to this method. First of all, the absence of any label on the surface prior to cleavage allows a high signal-to-noise ratio. Additionally, the signaling probe need not contain any labels, thus making synthesis easier. Furthermore, because the target-specific portion of the signaling probe is removed during the assay, the remaining detection sequence can be any sequence. This allows the use of a common sequence for all beads; even if different reactions are carried out in parallel on the array, the post-cleavage detection can be identical for all assays, thus requiring only one set of reagents. As will be appreciated by those in the art, it is also possible to have different detection sequences if required. In addition, since the label is attached post-cleavage, there is a great deal of flexibility in the type of label that may be incorporated. This can lead to significant signal amplification; for example, the use of highly labeled streptavidin bound to a biotin on the detection sequence can give an increased signal per detection sequence. Similarly, the use of enzyme labels such as alkaline phosphatase or horseradish peroxidase allows signal amplification as well.

A further advantage is the two-fold specificity that is built into the assay. By requiring specificity at the cleavage step, followed by specificity at the extension step, increased signal-to-noise ratios are seen.

As will be appreciated by those in the art, while generally described as a solid phase assay, this reaction may also be done in solution; this is similar to the solution-based SBE reactions, wherein the detection sequence serves as the extension primer. This assay also may be performed with an extension primer/adaptor oligonucleotide as described for solution-based SBE assays. It should be noted that the arrays used to detect the invasive cleavage/extension reactions may be of any type, including, but not limited to, spotted and printed arrays, photolithographic arrays, and bead arrays.

Combination of Ligation and Extension

In a preferred embodiment, OLA and SBE are combined, as is sometimes referred to as “Genetic Bit” analysis and described in Nikiforov et al., Nucleic Acids Res. 22:4167 (1994), hereby expressly incorporated by reference. In this embodiment, the two ligation probes do not hybridize adjacently; rather, they are separated by one or more bases. The addition of dNTPs and a polymerase, in addition to the ligation probes and the ligase, results in an extended, ligated probe. As for SBE, the dNTPs may carry different labels, or separate reactions can be run, if the SBE portion of the reaction is used for genotyping. Alternatively, if the ligation portion of the reaction is used for genotyping, either no extension occurs due to mismatch of the 3′ base (such that the polymerase will not extend it), or no ligation occurs due to mismatch of the 5′ base. As will be appreciated by those in the art, the reaction products are assayed using microsphere arrays. Again, as outlined herein, the assays may be solution based assays, with the ligated, extended probes being added to a microsphere array, or solid-phase assays. In addition, the unextended, unligated primers may be removed prior to detection as needed, as is outlined herein. Furthermore, adapter sequences may also be used as outlined herein for OLA.

Combination of OLA and PCR

In a preferred embodiment, OLA and PCR are combined. As will be appreciated by those in the art, the sequential order of the reaction is variable. That is, in some embodiments it is desired to perform the genotyping or OLA reaction first, followed by PCR amplification. In an alternative embodiment, it is desirable to first amplify the target, e.g. by PCR, followed by the OLA assay.

In a preferred embodiment, this technique is done on bead arrays.

Combination of Competitive Hybridization and Ligation

In a preferred embodiment, a combination of competitive hybridization and ligation is done. As will be appreciated by those in the art, this technique is OLA as described above, with at least two sets of probes comprising different bases in the readout position. By running the reactions under conditions that favor hybridization complexes with perfect matches, different alleles may be distinguished.

In one embodiment, LCR is used to genotype a single genomic locus by incorporating two sets of two optically labeled AS oligonucleotides and a detection oligonucleotide in the ligation reaction. The oligonucleotide ligation step discriminates between the AS oligonucleotides through the efficiency of ligation between an oligonucleotide with a correct match with the target nucleic acid versus a mismatch base in the target nucleic acid at the ligation site. Accordingly, a detection oligonucleotide ligates efficiently to an AS oligonucleotide if there is complete base pairing at the ligation site. One 3′ oligonucleotide (T base at 5′ end) is optically labeled with FAM (green fluorescent dye) and the other 3′ oligonucleotide (C base at 5′ end) is labelled with TMR (yellow fluorescent dye). An A base in the target nucleic acid base pairs with the corresponding T, resulting in efficient ligation of the FAM-labeled oligonucleotide. A G base in the target nucleic acid results in ligation of the TMR-labeled oligonucleotide. TMR and FAM have distinct emission spectra. Accordingly, the emission wavelength of the oligonucleotide ligated to the 5′ detection oligonucleotide indicates the nucleotide and thus the genotype of the target nucleic acid.
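The two-dye readout in this example can be sketched as a simple lookup. The dye-to-base assignment (FAM indicating an A in the target, TMR indicating a G) is taken from the example above; the function name, the signal dictionary and the detection threshold are hypothetical and serve only to illustrate the logic.

```python
# Dye-to-target-base assignment from the example in the text:
# the FAM-labeled oligonucleotide (5' T) ligates when the target carries A,
# the TMR-labeled oligonucleotide (5' C) ligates when the target carries G.
DYE_TO_TARGET_BASE = {"FAM": "A", "TMR": "G"}


def bases_from_dye_signals(signals, threshold):
    """Return the target bases indicated by dyes whose signal meets threshold.

    signals: mapping of dye name to measured intensity (hypothetical input).
    threshold: assumed detection cutoff, to be chosen empirically.
    """
    return sorted(
        base
        for dye, base in DYE_TO_TARGET_BASE.items()
        if signals.get(dye, 0.0) >= threshold
    )


if __name__ == "__main__":
    print(bases_from_dye_signals({"FAM": 800.0, "TMR": 20.0}, threshold=100.0))
    print(bases_from_dye_signals({"FAM": 450.0, "TMR": 430.0}, threshold=100.0))
```

A single base in the result corresponds to a homozygous locus, and both bases correspond to a heterozygous locus.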

In a preferred embodiment, this technique is done on bead arrays.

Combination of Competitive Hybridization and Invasive Cleavage

In a preferred embodiment, a combination of competitive hybridization and invasive cleavage is done. As will be appreciated by those in the art, this technique is invasive cleavage as described above, with at least two sets of probes (either the invader probes or the signaling probes) comprising different bases in the readout position. By running the reactions under conditions that favor hybridization complexes with perfect matches, different alleles may be distinguished.

In a preferred embodiment, this technique is done on bead arrays.

In addition to the amplification and genotyping embodiments disclosed herein, the present invention further provides compositions and methods for nucleic acid sequencing.

Sequencing

The present invention is directed to the sequencing of nucleic acids, particularly DNA, by synthesizing nucleic acids using the target sequence (i.e. the nucleic acid for which the sequence is determined) as a template. These methods can be generally described as follows. A target sequence is attached to a solid support, either directly or indirectly, as outlined below. The target sequence comprises a first domain and an adjacent second domain comprising target positions for which sequence information is desired. A sequencing primer is hybridized to the first domain of the target sequence, and an extension enzyme is added, such as a polymerase or a ligase, as outlined below. After the addition of each base, the identity of each newly added base is determined prior to adding the next base. This can be done in a variety of ways, including controlling the reaction rate and using a fast detector, such that the newly added bases are identified in real time. Alternatively, the addition of nucleotides is controlled by reversible chain termination, for example through the use of photocleavable blocking groups. Alternatively, the addition of nucleotides is controlled, so that the reaction is limited to one or a few bases at a time. The reaction is restarted after each cycle of addition and reading. Alternatively, the addition of nucleotides is accomplished by carrying out a ligation reaction with oligonucleotides comprising chain terminating oligonucleotides. Preferred methods of sequencing-by-synthesis include, but are not limited to, pyrosequencing, reversible-chain termination sequencing, time-resolved sequencing, ligation sequencing, and single-molecule analysis, all of which are described below.

The advantages of these “sequencing-by-synthesis” reactions can be augmented through the use of array techniques that allow very high density arrays to be made rapidly and inexpensively, thus allowing rapid and inexpensive nucleic acid sequencing. By “array techniques” is meant techniques that allow for analysis of a plurality of nucleic acids in an array format. The maximum number of nucleic acids is limited only by the number of discrete loci on a particular array platform. As is more fully outlined below, a number of different array formats can be used.

The methods of the invention find particular use in sequencing a target nucleic acid sequence, i.e. identifying the sequence of a target base or target bases in a target nucleic acid, which can ultimately be used to determine the sequence of long nucleic acids.

As is outlined herein, the target sequence comprises positions for which sequence information is desired, generally referred to herein as the “target positions”. In one embodiment, a single target position is elucidated; in a preferred embodiment, a plurality of target positions are elucidated. In general, the plurality of nucleotides in the target positions are contiguous with each other, although in some circumstances they may be separated by one or more nucleotides. By “plurality” as used herein is meant at least two. As used herein, the base which basepairs with the target position base in a hybrid is termed the “sequence position”. That is, as more fully outlined below, the extension of a sequence primer results in nucleotides being added in the sequence positions, that are perfectly complementary to the nucleotides in the target positions. As will be appreciated by one of ordinary skill in the art, identification of a plurality of target positions in a target nucleotide sequence results in the determination of the nucleotide sequence of the target nucleotide sequence.

As will be appreciated by one of ordinary skill in the art, this system can take on a number of different configurations, depending on the sequencing method used, the method of attaching a target sequence to a surface, etc. In general, the methods of the invention rely on the attachment of different target sequences to a solid support (which, as outlined below, can be accomplished in a variety of ways) to form an array. The target sequences comprise at least two domains: a first domain, for which sequence information is not desired, and to which a sequencing primer can hybridize, and a second domain, adjacent to the first domain, comprising the target positions for sequencing. A sequencing primer is hybridized to the target sequence, forming a hybridization complex, and then the sequencing primer is enzymatically extended by the addition of a first nucleotide into the first sequence position of the primer. This first nucleotide is then identified, as is outlined below, and then the process is repeated, to add nucleotides to the second, third, fourth, etc. sequence positions. The exact methods depend on the sequencing technique utilized, as outlined below.

Once the target sequence is associated onto the array as outlined below, the target sequence can be used in a variety of sequencing by synthesis reactions. These reactions are generally classified into several categories, outlined below.

Sequencing by Synthesis

As outlined herein, a number of sequencing by synthesis reactions are used to elucidate the identity of a plurality of bases at target positions within the target sequence. All of these reactions rely on the use of a target sequence comprising at least two domains: a first domain to which a sequencing primer will hybridize, and an adjacent second domain, for which sequence information is desired. Upon formation of the assay complex, extension enzymes are used to add dNTPs to the sequencing primer, and each addition of dNTP is “read” to determine the identity of the added dNTP. This may proceed for many cycles.

Pyrosequencing

In a preferred embodiment, pyrosequencing methods are done to sequence the nucleic acids. As outlined above, pyrosequencing is an extension method that can be used to add one or more nucleotides to the target positions. Pyrosequencing relies on the detection of a reaction product, pyrophosphate (PPi), produced during the addition of a dNTP to a growing oligonucleotide chain, rather than on a label attached to the nucleotide. One molecule of PPi is produced per dNTP added to the extension primer. The PPi produced during the reaction is monitored using secondary enzymes; for example, preferred embodiments utilize secondary enzymes that convert the PPi into ATP, which also may be detected in a variety of ways, for example through a chemiluminescent reaction using luciferase and luciferin, or by the detection of NADPH. Thus, by running sequential reactions with each of the nucleotides, and monitoring the reaction products, the identity of the added base is determined.

Accordingly, the present invention provides methods of pyrosequencing on arrays; the arrays may be any number of different array configurations and substrates, as outlined herein, with microsphere arrays being particularly preferred. In this embodiment, the target sequence comprises a first domain that is substantially complementary to a sequencing primer, and an adjacent second domain that comprises a plurality of target positions. By “sequencing primer” herein is meant a nucleic acid that is substantially complementary to the first target domain, with perfect complementarity being preferred. As will be appreciated by those in the art, the length of the sequencing primer will vary with the conditions used. In general, the sequencing primer ranges from about 6 to about 500 or more basepairs in length, with from about 8 to about 100 being preferred, and from about 10 to about 25 being especially preferred.

Once the sequencing primer is added and hybridized to the target sequence to form a first hybridization complex (also sometimes referred to herein as an “assay complex”), the system is ready to initiate sequencing-by-synthesis. The methods described below make reference to the use of fiber optic bundle substrates with associated microspheres, but as will be appreciated by those in the art, any number of other substrates or solid supports may be used, or arrays that do not comprise microspheres.

The reaction is initiated by introducing the substrate comprising the hybridization complex comprising the target sequence (i.e. the array) to a solution comprising a first nucleotide, generally comprising deoxynucleoside-triphosphates (dNTPs). Generally, the dNTPs comprise dATP, dTTP, dCTP and dGTP. The nucleotides may be naturally occurring, such as deoxynucleotides, or non-naturally occurring, such as chain terminating nucleotides including dideoxynucleotides, as long as the enzymes used in the sequencing/detection reactions are still capable of recognizing the analogs. In addition, as more fully outlined below, for example in other sequencing-by-synthesis reactions, the nucleotides may comprise labels. The different dNTPs are added either to separate aliquots of the hybridization complex or preferably sequentially to the hybridization complex, as is more fully outlined below. In some embodiments it is important that the hybridization complex be exposed to a single type of dNTP at a time.

In addition, as will be appreciated by those in the art, the extension reactions of the present invention allow the precise incorporation of modified bases into a growing nucleic acid strand. Thus, any number of modified nucleotides may be incorporated for any number of reasons, including probing structure-function relationships (e.g. DNA:DNA or DNA:protein interactions), cleaving the nucleic acid, crosslinking the nucleic acid, incorporating mismatches, etc.

In addition to a first nucleotide, the solution also comprises an extension enzyme, generally a DNA polymerase. Suitable DNA polymerases include, but are not limited to, the Klenow fragment of DNA polymerase I, SEQUENASE 1.0 and SEQUENASE 2.0 (U.S. Biochemical), T5 DNA polymerase and Phi29 DNA polymerase. If the dNTP is complementary to the base of the target sequence adjacent to the extension primer, the extension enzyme will add it to the extension primer, releasing pyrophosphate (PPi). Thus, the extension primer is modified, i.e. extended, to form a modified primer, sometimes referred to herein as a “newly synthesized strand”. The incorporation of a dNTP into a newly synthesized nucleic acid strand releases PPi, one molecule of PPi per dNTP incorporated.

The release of pyrophosphate (PPi) during the DNA polymerase reaction can be quantitatively measured by many different methods and a number of enzymatic methods have been described; see Reeves et al., Anal. Biochem. 28:282 (1969); Guillory et al., Anal. Biochem. 39:170 (1971); Johnson et al., Anal. Biochem. 15:273 (1968); Cook et al., Anal. Biochem. 91:557 (1978); Drake et al., Anal. Biochem. 94:117 (1979); Ronaghi et al., Science 281:363 (1998); Barshop et al., Anal. Biochem. 197(1):266-272 (1991); WO 93/23564; WO 98/28440; WO 98/13523; Nyren et al., Anal. Biochem. 151:504 (1985); all of which are incorporated by reference. The latter method allows continuous monitoring of PPi and has been termed ELIDA (Enzymatic Luminometric Inorganic Pyrophosphate Detection Assay). In a preferred embodiment, the PPi is detected utilizing UDP-glucose pyrophosphorylase, phosphoglucomutase and glucose 6-phosphate dehydrogenase. See Justesen et al., Anal. Biochem. 207(1):90-93 (1992); Lust et al., Clin. Chim. Acta 66(2):241 (1976); and Johnson et al., Anal. Biochem. 26:137 (1968); all of which are hereby incorporated by reference. This reaction produces NADPH which can be detected fluorometrically. A preferred embodiment utilizes any method which can result in the generation of an optical signal, with preferred embodiments utilizing the generation of a chemiluminescent or fluorescent signal.

Generally, these methods rely on secondary enzymes to detect the PPi; these methods generally rely on enzymes that will convert PPi into ATP, which can then be detected. A preferred method monitors the creation of PPi by the conversion of PPi to ATP by the enzyme sulfurylase, and the subsequent production of visible light by firefly luciferase (see Ronaghi et al., supra, and Barshop, supra). In this method, the four deoxynucleotides (dATP, dGTP, dCTP and dTTP; collectively dNTPs) are added stepwise to a partial duplex comprising a sequencing primer hybridized to a single stranded DNA template and incubated with DNA polymerase, ATP sulfurylase (and its substrate, adenosine 5′-phosphosulphate (APS)), luciferase (and its substrate luciferin), and optionally a nucleotide-degrading enzyme such as apyrase. A dNTP is only incorporated into the growing DNA strand if it is complementary to the base in the template strand. The synthesis of DNA is accompanied by the release of PPi equal in molarity to the incorporated dNTP. The PPi is converted to ATP and the light generated by the luciferase is directly proportional to the amount of ATP. In some cases the unincorporated dNTPs and the produced ATP are degraded between each cycle by the nucleotide degrading enzyme.

As will be appreciated by those in the art, if the target sequence comprises two or more of the same nucleotide in a row, more than one dNTP will be incorporated; however, the amount of PPi generated is directly proportional to the number of dNTPs incorporated and thus these sequences can be detected.
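The stepwise addition of single dNTPs, and the proportionality between signal and the number of dNTPs incorporated, can be illustrated with an idealized simulation. This is a sketch under stated assumptions: the function names, the fixed A, C, G, T dispensation order and the noise-free integer signal are illustrative choices, and the template string lists the target positions in the order in which the polymerase reads them.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_flowgram(template, flow_order="ACGT", max_cycles=50):
    """Return a list of (dNTP, signal) pairs, one per nucleotide addition.

    template: target positions in the order read by the polymerase.
    The signal is the number of dNTPs incorporated in that step, i.e. the
    idealized PPi yield (one PPi per dNTP); real signals are noisy.
    """
    pos = 0
    flows = []
    for _ in range(max_cycles):
        for nt in flow_order:
            incorporated = 0
            # A run of identical target bases incorporates several dNTPs
            # in a single step, giving a proportionally larger signal.
            while pos < len(template) and COMPLEMENT[template[pos]] == nt:
                incorporated += 1
                pos += 1
            flows.append((nt, incorporated))
            if pos == len(template):
                return flows
    return flows


def decode_flowgram(flows):
    """Rebuild the newly synthesized strand from the per-step signals."""
    return "".join(nt * n for nt, n in flows)


if __name__ == "__main__":
    flows = simulate_flowgram("TTGCA")
    print(flows)
    print(decode_flowgram(flows))
```

In this idealized model the two adjacent T bases in the template produce a signal of 2 in the dATP step, which is how runs of identical nucleotides remain detectable.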

In addition, in a preferred embodiment, the dATP that is added to the reaction mixture is an analog that can be incorporated by the DNA polymerase into the growing oligonucleotide strand, but will not serve as a substrate for the second enzyme; for example, certain thiol-containing dATP analogs find particular use.

Accordingly, a preferred embodiment of the methods of the invention is as follows. A substrate comprising microspheres containing the target sequences and extension primers, forming hybridization complexes, is dipped or contacted with a volume (reaction chamber or well) comprising a single type of dNTP, an extension enzyme, and the reagents and enzymes necessary to detect PPi. If the dNTP is complementary to the base of the target portion of the target sequence adjacent to the extension primer, the dNTP is added, releasing PPi and generating detectable light, which is detected as generally described in U.S. Ser. Nos. 09/151,877 and 09/189,543, and PCT US98/09163, all of which are hereby incorporated by reference. If the dNTP is not complementary, no detectable signal results. The substrate is then contacted with a second reaction chamber comprising a different dNTP and the additional components of the assay. This process is repeated to generate a readout of the sequence of the target sequence.

In a preferred embodiment, washing steps, i.e. the use of washing chambers, may be done in between the dNTP reaction chambers, as required. These washing chambers may optionally comprise a nucleotide-degrading enzyme, to remove any unreacted dNTP and decrease the background signal, as is described in WO 98/28440, incorporated herein by reference. In a preferred embodiment a flow cell is used as a reaction chamber; following each reaction the unreacted dNTP is washed away and may be replaced with an additional dNTP to be examined.

As will be appreciated by those in the art, the system can be configured in a variety of ways, including either a linear progression or a circular one; for example, four substrates may be used that each can dip into one of four reaction chambers arrayed in a circular pattern. Each cycle of sequencing and reading is followed by a 90 degree rotation, so that each substrate then dips into the next reaction well. This allows a continuous series of sequencing reactions on multiple substrates in parallel.
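The circular configuration can be pictured as a simple rotation schedule. The following sketch assumes four substrates and four chambers, one dNTP per chamber, with each 90 degree rotation advancing every substrate by one chamber; the name `chamber_schedule` and the chamber order are illustrative assumptions.

```python
def chamber_schedule(dntps=("A", "C", "G", "T"), steps=4):
    """List, for each step, which dNTP chamber each substrate dips into.

    Substrate s starts over chamber s; each step is one 90 degree rotation.
    """
    n = len(dntps)
    return [[dntps[(substrate + step) % n] for substrate in range(n)]
            for step in range(steps)]

for step, row in enumerate(chamber_schedule()):
    print(step, row)
```

After four steps every substrate has visited each of the four chambers once, and no chamber is idle at any step.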

In a preferred embodiment, one or more internal control sequences are used. That is, at least one microsphere in the array comprises a known sequence that can be used to verify that the reactions are proceeding correctly. In a preferred embodiment, at least four control sequences are used, each of which has a different nucleotide at each position: the first control sequence will have an adenosine at position 1, the second will have a cytosine, the third a guanosine, and the fourth a thymidine, thus ensuring that at least one control sequence is “lighting up” at each step to serve as an internal control.
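One way to construct such a set of four controls is by cyclic shifts of the four bases, as in the sketch below. The construction and the name `control_sequences` are assumptions made for illustration; the text above only requires that the four controls differ from one another at each position.

```python
BASES = "ACGT"

def control_sequences(length):
    """Four control read-outs; at every position the four controls carry four different bases."""
    return ["".join(BASES[(pos + k) % 4] for pos in range(length))
            for k in range(4)]

for seq in control_sequences(6):
    print(seq)
```

Because every column of this set contains A, C, G and T exactly once, whichever base is interrogated at a given position is present in one of the controls.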

In a preferred embodiment, the reaction is run for a number of cycles until the signal-to-noise ratio becomes low, generally from 20 to 70 cycles or more, with from about 30 to 50 being standard. In some embodiments, this is sufficient for the purposes of the experiment; for example, for the detection of certain mutations, including single nucleotide polymorphisms (SNPs), the experiment is designed such that the initial round of sequencing gives the desired information. In other embodiments, it is desirable to sequence longer targets, for example in excess of hundreds of bases. In this application, additional rounds of sequencing can be done.

For example, after a certain number of cycles, it is possible to stop the reaction, remove the newly synthesized strand using either a thermal step or a chemical wash, and start the reaction over, using for example the sequence information that was previously generated to make a new extension primer that will hybridize to the first target portion of the target sequence. That is, the sequence information generated in the first round is transferred to an oligonucleotide synthesizer, and a second extension primer is made for a second round of sequencing. In this way, multiple overlapping rounds of sequencing are used to generate long sequences from template nucleic acid strands. Alternatively, when a single target sequence contains a number of mutational “hot spots”, primers can be generated using the known sequences in between these hot spots.

Additionally, the methods of the invention find use in the decoding of random microsphere arrays. That is, as described in U.S. Ser. No. 09/189,543, nucleic acids can be used as bead identifiers. By using sequencing-by-synthesis to read out the sequence of the nucleic acids, the beads can be decoded in a highly parallel fashion.

In addition, the methods find use in simultaneous analysis of multiple target sequence positions on a single array. For example, four separate sequence analysis reactions are performed. In the first reaction, positions containing a particular nucleotide (“A”, for example) in the target sequence are analyzed. In three other reactions, C, G, and T are analyzed. An advantage of analyzing one base per reaction is that the baseline or background is flattened for the three bases excluded from the reaction. Therefore, the signal is more easily detected and the sensitivity of the assay is increased. Alternatively, each of the four sequencing reactions (A, G, C and T) can be performed simultaneously with a nested set of primers, providing a significant advantage in that primer synthesis can be made more efficient.

In another preferred embodiment each probe is represented by multiple beads in the array (see U.S. Ser. No. 09/287,573, filed Apr. 6, 1999, hereby expressly incorporated by reference). As a result, each experiment can be replicated many times in parallel. As outlined below, averaging the signal from each respective probe in an experiment also allows for improved signal to noise and increases the sensitivity of detecting subtle perturbations in signal intensity patterns. The use of redundancy and comparing the patterns obtained from two different samples (e.g. a reference and an unknown) results in highly parallel and comparative sequence analysis that can be performed on complex nucleic acid samples.
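The averaging of redundant beads can be sketched as follows. The input format (a list of probe name and intensity pairs, one per decoded bead) and the function name `average_by_probe` are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

def average_by_probe(bead_signals):
    """Average the intensities of all beads that carry the same probe.

    bead_signals: iterable of (probe_name, intensity), one entry per bead.
    """
    grouped = defaultdict(list)
    for probe, intensity in bead_signals:
        grouped[probe].append(intensity)
    return {probe: mean(values) for probe, values in grouped.items()}

beads = [("probe_1", 1.0), ("probe_1", 3.0), ("probe_2", 5.0)]
print(average_by_probe(beads))
```

For independent bead-to-bead noise, the standard error of such a mean falls roughly with the square root of the number of replicate beads, which is the statistical basis for the improved signal to noise noted above.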

As outlined herein, the pyrosequencing systems may be configured in a variety of ways; for example, the target sequence may be attached to the array (e.g. the beads) in a variety of ways, including the direct attachment of the target sequence to the array; the use of a capture probe with a separate extension probe; the use of a capture extender probe, a capture probe and a separate extension probe; the use of adapter sequences in the target sequence with capture and extension probes; and the use of a capture probe that also serves as the extension probe.

In addition, as will be appreciated by those in the art, the target sequence may comprise any number of sets of different first and second target domains; that is, depending on the number of target positions that may be elucidated at a time, there may be several “rounds” of sequencing occurring, each time using a different target domain.

One additional benefit of pyrosequencing for genotyping purposes is that since the reaction does not rely on the incorporation of labels into a growing chain, the unreacted extension primers need not be removed.

Thus, pyrosequencing kits and reactions require, in no particular order, arrays comprising capture probes, sequencing primers, an extension enzyme, and secondary enzymes and reactants for the detection of PPi, generally comprising enzymes to convert PPi into ATP (or other NTPs), and enzymes and reactants to detect ATP.

Attachment of Enzymes to Arrays

In a preferred embodiment, particularly when secondary enzymes (i.e. enzymes other than extension enzymes) are used in the reaction, the enzyme(s) may be attached, preferably through the use of flexible linkers, to the sites on the array, e.g. the beads. For example, when pyrosequencing is done, one embodiment utilizes detection based on the generation of a chemiluminescent signal in the “zone” around the bead. By attaching the secondary enzymes required to generate the signal, an increased concentration of the required enzymes is obtained in the immediate vicinity of the reaction, thus allowing for the use of less enzyme and faster reaction rates for detection. Thus, preferred embodiments utilize the attachment, preferably covalently (although as will be appreciated by those in the art, other attachment mechanisms may be used), of the non-extension secondary enzymes used to generate the signal. In some embodiments, the extension enzyme (e.g. the polymerase) may be attached as well, although this is not generally preferred.

The attachment of enzymes to array sites, particularly beads, is outlined in U.S. Ser. No. 09/287,573, hereby incorporated by reference, and will be appreciated by those in the art. In general, the use of flexible linkers is preferred, as this allows the enzymes to interact with the substrates. However, for some types of attachment, linkers are not needed. Attachment proceeds on the basis of the composition of the array site (i.e. either the substrate or the bead, depending on which array system is used) and the composition of the enzyme. In a preferred embodiment, depending on the composition of the array site (e.g. the bead), it will contain chemical functional groups for subsequent attachment of other moieties. For example, beads comprising a variety of chemical functional groups such as amines are commercially available. Preferred functional groups for attachment are amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being particularly preferred. Using these functional groups, the enzymes can be attached using functional groups on the enzymes. For example, enzymes containing amino groups can be attached to particles comprising amino groups, for example using linkers as are known in the art; for example, homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company catalog, technical section on cross-linkers, pages 155-200, incorporated herein by reference).

Reversible Chain Termination Methods

In a preferred embodiment, the sequencing-by-synthesis method utilized is reversible chain termination. In this embodiment, the rate of addition of dNTPs is controlled by using nucleotide analogs that contain a removable protecting group at the 3′ position of the dNTP. The presence of the protecting group prevents further addition of dNTPs at the 3′ end, thus allowing time for detection of the nucleotide added (for example, utilizing a labeled dNTP). After acquisition of the identity of the dNTP added, the protecting group is removed and the cycle repeated. In this way, dNTPs are added one at a time to the sequencing primer to allow elucidation of the nucleotides at the target positions. See U.S. Pat. Nos. 5,902,723; 5,547,839; Metzker et al., Nucl. Acid Res. 22(20):4259 (1994); Canard et al., Gene 148(1):1-6 (1994); Dyatkina et al., Nucleic Acid Symp. Ser. 18:117-120 (1987); all of which are hereby expressly incorporated by reference.

Accordingly, the present invention provides methods and compositions for reversible chain termination sequencing-by-synthesis. Similar to pyrosequencing, the reaction requires the hybridization of a substantially complementary sequencing primer to a first target domain of a target sequence to form an assay complex.

The reaction is initiated by introducing the assay complex comprising the target sequence (i.e. the array) to a solution comprising a first nucleotide analog. By “nucleotide analog” in this context herein is meant a deoxynucleoside-triphosphate (also called deoxynucleotides or dNTPs, i.e. dATP, dTTP, dCTP and dGTP), that is further derivatized to be reversibly chain terminating. As will be appreciated by those in the art, any number of nucleotide analogs may be used, as long as a polymerase enzyme will still incorporate the nucleotide at the sequence position. Preferred embodiments utilize 3′-O-methyl-dNTPs (with photolytic removal of the protecting group) and 3′-substituted-2′-dNTPs that contain anthranilic derivatives that are fluorescent (with alkali or enzymatic treatment for removal of the protecting group). The latter has the advantage that the protecting group is also the fluorescent label; upon cleavage, the label is also removed, which may serve to generally lower the background of the assay as well.

Again, the system may be configured and/or utilized in a number of ways. In a preferred embodiment, a set of nucleotide analogs such as derivatized dATP, derivatized dCTP, derivatized dGTP and derivatized dTTP is used, each with a different detectable and resolvable label, as outlined below. Thus, the identification of the base at the first sequencing position can be ascertained by the presence of the unique label.

Alternatively, a single label is used but the reactions are done sequentially. That is, the substrate comprising the array is first contacted with a reaction mixture of an extension enzyme and a single type of base with a first label, for example ddATP. The incorporation of the ddATP is monitored at each site on the array. The substrate is then contacted (with optional washing steps as needed) with a second reaction mixture comprising the extension enzyme and a second nucleotide, for example ddTTP. The reaction is then monitored; this can be repeated for each target position.

Once each reaction has been completed and the identification of the base at the sequencing position is ascertained, the terminating protecting group is removed, e.g. cleaved, leaving a free 3′ end to repeat the sequence, using an extension enzyme to add a base to the 3′ end of the sequencing primer when it is hybridized to the target sequence. As will be appreciated by those in the art, the cleavage conditions will vary with the protecting group chosen.
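The incorporate, detect and deprotect cycle can be summarized in a short sketch. It assumes the four-label configuration with ideal chemistry (exactly one protected analog added per cycle, complete removal of the protecting group); the label names and the function `sequence_by_reversible_termination` are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Hypothetical labels, one resolvable label per derivatized dNTP.
LABEL_FOR_BASE = {"A": "label_1", "C": "label_2", "G": "label_3", "T": "label_4"}
BASE_FOR_LABEL = {label: base for base, label in LABEL_FOR_BASE.items()}

def sequence_by_reversible_termination(template, cycles):
    """Call one base per cycle: incorporate, detect the label, then deprotect.

    template: template bases adjacent to the primer, in the order copied.
    """
    calls = []
    for pos in range(min(cycles, len(template))):
        incorporated = COMPLEMENT[template[pos]]      # the 3' protecting group limits this to one base
        observed_label = LABEL_FOR_BASE[incorporated]  # detection step
        calls.append(BASE_FOR_LABEL[observed_label])
        # Deprotection: the protecting group is cleaved, freeing the 3' end for the next cycle.
    return "".join(calls)

print(sequence_by_reversible_termination("TTGCA", cycles=5))
```

Unlike the pyrosequencing case, a run of identical template bases is read here one base per cycle, since each added analog blocks further extension until it is deprotected.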

In a preferred embodiment, the nucleotide analogs comprise a detectable label as described herein, and this may be a primary label (directly detectable) or a secondary label (indirectly detectable).

In addition to a first nucleotide, the solution also comprises an extension enzyme, generally a DNA polymerase, as outlined above for pyrosequencing.

In a preferred embodiment, the protecting group also comprises a label. That is, as outlined in Canard et al., supra, the protecting group can serve as either a primary or secondary label, with the former being preferred. This is particularly preferred as the removal of the label at each round results in less background noise, less quenching and less crosstalk.

In this way, reversible chain termination sequencing is accomplished.

Time-Resolved Sequencing

In a preferred embodiment, time-resolved sequencing is done. This embodiment relies on controlling the reaction rate of the extension reaction and/or using a fast imaging system. Basically, the method involves a simple extension reaction that is either “slowed down”, or imaged using a fast system, or both. What is important is that the rate of polymerization (extension) is significantly slower than the rate of image capture.

To allow for real time sequencing, parameters such as the speed of the detector (millisecond speed is preferred) and the rate of polymerization will be controlled such that the rate of polymerization is significantly slower than the rate of image capture. Polymerization rates on the order of kilobases per minute (e.g. about 10 milliseconds/nucleotide), which can be adjusted, should allow a sufficiently wide window to find conditions where the sequential addition of two nucleotides can be resolved. The DNA polymerization reaction, which has been studied intensively, can easily be reconstituted in vitro and controlled by varying a number of parameters including reaction temperature and the concentration of nucleotide triphosphates.
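The relationship between polymerization rate and imaging speed is simple arithmetic, sketched below. The function names and the example values (6 kilobases per minute, a 1 millisecond frame time) are assumptions chosen to match the order of magnitude given above.

```python
def ms_per_nucleotide(kilobases_per_minute):
    """Average time between incorporations, in milliseconds."""
    nucleotides_per_minute = kilobases_per_minute * 1000.0
    return 60_000.0 / nucleotides_per_minute

def frames_per_incorporation(kilobases_per_minute, frame_time_ms):
    """Number of image frames captured per incorporated nucleotide."""
    return ms_per_nucleotide(kilobases_per_minute) / frame_time_ms

print(ms_per_nucleotide(6))              # 6 kb/min corresponds to 10 ms per nucleotide
print(frames_per_incorporation(6, 1.0))  # frames per nucleotide with a 1 ms detector
```

A ratio well above one frame per incorporation is what allows two sequential additions to be resolved; slowing the polymerase or shortening the frame time both increase it.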

In addition, the polymerase can be applied to the primer-template complex prior to initiating the reaction. This serves to synchronize the reaction. Numerous polymerases are available. Some examples include, but are not limited to, polymerases with 3′ to 5′ exonuclease activity, other nuclease activities, polymerases with different processivity, affinities for modified and unmodified nucleotide triphosphates, temperature optima, stability, and the like.

Thus, in this embodiment, the reaction proceeds as outlined above. The target sequence, comprising a first domain that will hybridize to a sequencing primer and a second domain comprising a plurality of target positions, is attached to an array as outlined below. The sequencing primers are added, along with an extension enzyme, as outlined herein, and dNTPs are added. Again, as outlined above, either four differently labeled dNTPs may be used simultaneously or four different sequential reactions with a single label are done. In general, the dNTPs comprise either a primary or a secondary label, as outlined above.

In a preferred embodiment, the extension enzyme is one that is relatively “slow”. This may be accomplished in several ways. In one embodiment, polymerase variants are used that have a lower polymerization rate than wild-type enzymes. Alternatively, the reaction rate may be controlled by varying the temperature and the concentration of dNTPs.

In a preferred embodiment, a fast (millisecond) high-sensitivity imaging system is used.

In one embodiment, DNA polymerization (extension) is monitored using light scattering, as is outlined in Johnson et al., Anal. Biochem. 136(1):192 (1984), hereby expressly incorporated by reference.

Attachment of Target Sequences to Arrays

As is generally described herein, there are a variety of methods that can be used to attach target sequences to the solid supports of the invention, particularly to the microspheres that are distributed on a surface of a substrate. Most of these methods generally rely on capture probes attached to the array. However, the attachment may be direct or indirect. Direct attachment includes those situations wherein an endogenous portion of the target sequence hybridizes to the capture probe, or where the target sequence has been manipulated to contain exogenous adapter sequences that are added to the target sequence, for example during an amplification reaction. Indirect attachment utilizes one or more secondary probes, termed a “capture extender probe” as outlined herein.

In a preferred embodiment, direct attachment is done, as is generally depicted in FIG. 1A. In this embodiment, the target sequence comprises a first target domain that hybridizes to all or part of the capture probe.

In a preferred embodiment, direct attachment is accomplished through the use of adapters. The adapter is a chemical moiety that allows one to address the products of a reaction to a solid surface. The type of reaction includes the amplification, genotyping and sequencing reactions disclosed herein. The adapter chemical moiety is independent of the reaction. Because the adapters are independent of the reaction, sets of adapters can be reused to create a “universal” array that can detect a variety of products from a reaction by attaching the set of adapters that address to specific locations within the array to different reactants.

Typically, the adapter and the capture probe on an array are binding partners, as defined herein. Although the use of other binding partners is possible, preferred embodiments utilize nucleic acid adapters that are non-complementary to any reactants or target sequences, but are substantially complementary to all or part of the capture probe on the array.

Thus, an “adapter sequence” is a nucleic acid that is generally not native to the target sequence, i.e. is exogenous, but is added or attached to the target sequence. It should be noted that in this context, the “target sequence” can include the primary sample target sequence, or can be a derivative target such as a reactant or product of the reactions outlined herein; thus for example, the target sequence can be a PCR product, a first ligation probe or a ligated probe in an OLA reaction, etc.

As will be appreciated by those in the art, the attachment, or joining, of the adapter sequence to the target sequence can be done in a variety of ways. In a preferred embodiment, the adapter sequences are added to the primers of the reaction (extension primers, amplification primers, readout probes, sequencing primers, Rolling Circle primers, etc.) during the chemical synthesis of the primers. The adapter then gets added to the reaction product during the reaction; for example, the primer gets extended using a polymerase to form the new target sequence that now contains an adapter sequence. Alternatively, the adapter sequences can be added enzymatically. Furthermore, the adapter can be attached to the target after synthesis; this post-synthesis attachment could be either covalent or non-covalent.

In this embodiment, one or more of the amplification primers comprises a first portion comprising the adapter sequence and a second portion comprising the primer sequence. Extending the amplification primer as is well known in the art results in target sequences that comprise the adapter sequences. The adapter sequences are designed to be substantially complementary to capture probes.

In addition, as will be appreciated by those in the art, the adapter can be attached either on the 3′ or 5′ ends, or in an internal position. For example, the adapter may be the detection sequence of an invasive cleavage probe. In the case of Rolling Circle probes, the adapter can be contained within the section between the probe ends. Adapters can also be attached to aptamers. Aptamers are nucleic acids that can be made to bind to virtually any target analyte; see Bock et al., Nature 355:564 (1992); Famulok et al., Current Op. Chem. Biol. 2:230 (1998); and U.S. Pat. Nos. 5,270,163, 5,475,096, 5,567,588, 5,595,877, 5,637,459, 5,683,867, 5,705,337, and related patents, hereby incorporated by reference. In addition, as outlined below, the adapter can be attached to non-nucleic acid target analytes as well.

In one embodiment, a set of probes is hybridized to a target sequence; each probe is complementary to a different region of a single target but each contains the same adapter. Using a poly-T bead, the mRNA target is pulled out of the sample with the probes attached. Dehybridizing the probes attached to the target sequence and rehybridizing them to an array containing the capture probes complementary to the adapter sequences results in binding to the array. All adapters that have bound to the same target mRNA will bind to the same location on the array.

In a preferred embodiment, indirect attachment of the target sequence to the array is done, through the use of capture extender probes. “Capture extender” probes are generally depicted in FIG. 1C, and other figures, and have a first portion that will hybridize to all or part of the capture probe, and a second portion that will hybridize to a first portion of the target sequence. Two capture extender probes may also be used. This has generally been done to stabilize assay complexes, for example when the target sequence is large, or when large amplifier probes (particularly branched or dendrimer amplifier probes) are used.

When only capture probes are utilized, it is necessary to have unique capture probes for each target sequence; that is, the surface must be customized to contain unique capture probes; e.g. each bead comprises a different capture probe. In general, only a single type of capture probe should be bound to a bead; however, different beads should contain different capture probes so that different target sequences bind to different beads.

Alternatively, the use of adapter sequences and capture extender probes allows the creation of more “universal” surfaces. In a preferred embodiment, an array of different and usually artificial capture probes is made; that is, the capture probes do not have complementarity to known target sequences. The adapter sequences can then be added to any target sequences, or soluble capture extender probes are made; this allows the manufacture of only one kind of array, with the user able to customize the array through the use of adapter sequences or capture extender probes. This then allows the generation of customized soluble probes, which as will be appreciated by those in the art is generally simpler and less costly.
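The addressing logic of a universal array can be sketched as a lookup from adapter to array site. The sketch assumes perfectly complementary adapters and capture probes, both written 5′ to 3′; the sequences, target names and the function `assign_sites` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def assign_sites(capture_probes, adapter_to_target):
    """Map each target to the array site whose capture probe pairs with its adapter.

    capture_probes: dict of site id -> capture probe sequence (fixed by the array maker).
    adapter_to_target: dict of adapter sequence -> target name (chosen by the user).
    """
    site_by_adapter = {reverse_complement(probe): site
                       for site, probe in capture_probes.items()}
    return {target: site_by_adapter[adapter]
            for adapter, target in adapter_to_target.items()
            if adapter in site_by_adapter}

array = {"site_1": "AAGGCC", "site_2": "TTTGGA"}       # one manufactured array
assay = {"GGCCTT": "target_X", "TCCAAA": "target_Y"}   # user-chosen adapter assignments
print(assign_sites(array, assay))
```

Only the `assay` mapping changes from one experiment to the next; the array itself is unchanged, which is the sense in which the surface is universal.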

As will be appreciated by those in the art, the length of the adapter sequences will vary, depending on the desired “strength” of binding and the number of different adapters desired. In a preferred embodiment, adapter sequences range from about 6 to about 500 basepairs in length, with from about 8 to about 100 being preferred, and from about 10 to about 25 being particularly preferred.

In one embodiment, microsphere arrays containing a single type of capture probe are made; in this embodiment, the capture extender probes are added to the beads prior to loading on the array. The capture extender probes may be additionally fixed or crosslinked, as necessary.

In a preferred embodiment, as outlined in FIG. 1B, the capture probe comprises the sequencing primer; that is, after hybridization to the target sequence, it is the capture probe itself that is extended during the synthesis reaction.

In one embodiment, capture probes are not used, and the target sequences are attached directly to the sites on the array. For example, libraries of clonal nucleic acids, including DNA and RNA, are used. In this embodiment, individual nucleic acids are prepared, generally using conventional methods (including, but not limited to, propagation in plasmid or phage vectors, amplification techniques including PCR, etc.). The nucleic acids are preferably arrayed in some format, such as a microtiter plate format, and are either spotted, or beads are added for attachment of the libraries.

Attachment of the clonal libraries (or any of the nucleic acids outlined herein) may be done in a variety of ways, as will be appreciated by those in the art, including, but not limited to, chemical or affinity capture (for example, including the incorporation of derivatized nucleotides such as AminoLink or biotinylated nucleotides that can then be used to attach the nucleic acid to a surface, as well as affinity capture by hybridization), cross-linking, and electrostatic attachment, etc.

In a preferred embodiment, affinity capture is used to attach the clonal nucleic acids to the surface. For example, cloned nucleic acids can be derivatized, for example with one member of a binding pair, and the beads derivatized with the other member of a binding pair. Suitable binding pairs are as described herein for secondary labels and IBL/DBL pairs. For example, the cloned nucleic acids may be biotinylated (for example using enzymatic incorporation of biotinylated nucleotides, or by photoactivated cross-linking of biotin). Biotinylated nucleic acids can then be captured on streptavidin coated beads, as is known in the art. Similarly, other hapten-receptor combinations can be used, such as digoxigenin and anti-digoxigenin antibodies. Alternatively, chemical groups can be added in the form of derivatized nucleotides, that can then be used to add the nucleic acid to the surface.

Preferred attachments are covalent, although even relatively weak interactions (i.e. non-covalent) can be sufficient to attach a nucleic acid to a surface if there are multiple sites of attachment per nucleic acid. Thus, for example, electrostatic interactions can be used for attachment, for example by having beads carrying the opposite charge to the bioactive agent.

Similarly, affinity capture utilizing hybridization can be used to attach cloned nucleic acids to beads. For example, as is known in the art, polyA+ RNA is routinely captured by hybridization to oligo-dT beads (this may include oligo-dT capture followed by a cross-linking step, such as psoralen crosslinking). If the nucleic acids of interest do not contain a polyA tract, one can be attached by polymerization with terminal transferase, or via ligation of an oligoA linker, as is known in the art.

Alternatively, chemical crosslinking may be done, for example by photoactivated crosslinking of thymidine to reactive groups, as is known in the art.

In general, special methods are required to decode clonal arrays, as is more fully outlined below.

Assay and Arrays

All of the above compositions and methods are directed to the detection and/or quantification of the products of nucleic acid reactions. The detection systems of the present invention are based on the incorporation (or in some cases, the deletion) of a detectable label into an assay complex on an array.

Accordingly, the present invention provides methods and compositions useful in the detection of nucleic acids. As will be appreciated by those in the art, the compositions of the invention can take on a wide variety of configurations, as is generally outlined in the Figures. As is more fully outlined below, preferred systems of the invention work as follows. A target nucleic acid sequence is attached (via hybridization) to an array site. This attachment can be either directly to a capture probe on the surface, through the use of adapters, or indirectly, using capture extender probes as outlined herein. In some embodiments, the target sequence itself comprises the labels. Alternatively, a label probe is then added, forming an assay complex. The attachment of the label probe may be direct (i.e. hybridization to a portion of the target sequence), or indirect (i.e. hybridization to an amplifier probe that hybridizes to the target sequence), with all the required nucleic acids forming an assay complex.

Accordingly, the present invention provides array compositions comprising at least a first substrate with a surface comprising individual sites. By “array” or “biochip” herein is meant a plurality of nucleic acids in an array format; the size of the array will depend on the composition and end use of the array. Nucleic acid arrays are known in the art, and can be classified in a number of ways; both ordered arrays (e.g. the ability to resolve chemistries at discrete sites), and random arrays are included. Ordered arrays include, but are not limited to, those made using photolithography techniques (Affymetrix GeneChip™), spotting techniques (Synteni and others), printing techniques (Hewlett Packard and Rosetta), three dimensional “gel pad” arrays, etc. A preferred embodiment utilizes microspheres on a variety of substrates including fiber optic bundles, as are outlined in PCT US98/21193, PCT US99/14387 and PCT US98/05025; WO98/50782; and U.S. Ser. Nos. 09/287,573, 09/151,877, 09/256,943, 09/316,154, 60/119,323, 09/315,584; all of which are expressly incorporated by reference. While much of the discussion below is directed to the use of microsphere arrays on fiber optic bundles, any array format of nucleic acids on solid supports may be utilized.

Arrays containing from about 2 different bioactive agents (e.g. different beads, when beads are used) to many millions can be made, with very large arrays being possible. Generally, the array will comprise from two to as many as a billion or more, depending on the size of the beads and the substrate, as well as the end use of the array; thus very high density, high density, moderate density, low density and very low density arrays may be made. Preferred ranges for very high density arrays are from about 10,000,000 to about 2,000,000,000, with from about 100,000,000 to about 1,000,000,000 being preferred (all numbers being per square cm). High density arrays range from about 100,000 to about 10,000,000, with from about 1,000,000 to about 5,000,000 being particularly preferred. Moderate density arrays range from about 10,000 to about 100,000, with from about 20,000 to about 50,000 being especially preferred. Low density arrays are generally less than 10,000, with from about 1,000 to about 5,000 being preferred. Very low density arrays are less than 1,000, with from about 10 to about 1,000 being preferred, and from about 100 to about 500 being particularly preferred. In some embodiments, the compositions of the invention may not be in array format; that is, for some embodiments, compositions comprising a single bioactive agent may be made as well. In addition, in some arrays, multiple substrates may be used, either of different or identical compositions. Thus for example, large arrays may comprise a plurality of smaller substrates.
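The density classes above can be expressed as a simple lookup. Because the stated ranges share their end points, the sketch below assigns each boundary value to the higher class; that tie-breaking rule and the function name `density_class` are assumptions of this illustration.

```python
def density_class(sites_per_cm2):
    """Classify an array by its number of sites per square cm."""
    if sites_per_cm2 >= 10_000_000:
        return "very high density"
    if sites_per_cm2 >= 100_000:
        return "high density"
    if sites_per_cm2 >= 10_000:
        return "moderate density"
    if sites_per_cm2 >= 1_000:
        return "low density"
    return "very low density"

for n in (500, 5_000, 50_000, 5_000_000, 100_000_000):
    print(n, density_class(n))
```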

In addition, one advantage of the present compositions is that particularly through the use of fiber optic technology, extremely high density arrays can be made. Thus for example, because beads of 200 μm or less (with beads of 200 nm possible) can be used, and very small fibers are known, it is possible to have as many as 40,000 or more (in some instances, 1 million) different elements (e.g. fibers and beads) in a 1 mm² fiber optic bundle, with densities of greater than 25,000,000 individual beads and fibers (again, in some instances as many as 50-100 million) per 0.5 cm² obtainable (4 million per square cm for 5 μm center-to-center and 100 million per square cm for 1 μm center-to-center).
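The center-to-center figures follow from simple geometry, as the sketch below shows. It assumes sites on a square grid (hexagonally packed fibers would give somewhat higher densities); the function name `sites_per_cm2` is hypothetical.

```python
def sites_per_cm2(pitch_um):
    """Sites per square cm for a square grid with the given center-to-center spacing."""
    sites_per_side = 10_000 / pitch_um   # 1 cm = 10,000 micrometers
    return sites_per_side ** 2

print(sites_per_cm2(5))        # 5 micrometer spacing: 4 million sites per square cm
print(sites_per_cm2(1))        # 1 micrometer spacing: 100 million sites per square cm
print(sites_per_cm2(5) / 100)  # per square mm (1 cm² = 100 mm²): 40,000 sites
```

The last line reproduces the figure of 40,000 elements in a 1 mm² bundle for a 5 μm spacing.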

By “substrate” or “solid support” or other grammatical equivalents herein is meant any material that can be modified to contain discrete individual sites appropriate for the attachment or association of beads and is amenable to at least one detection method. As will be appreciated by those in the art, the number of possible substrates is very large. Possible substrates include, but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon, and modified silicon, carbon, metals, inorganic glasses, plastics, optical fiber bundles, and a variety of other polymers. In general, the substrates allow optical detection and do not themselves appreciably fluoresce.

Generally the substrate is flat (planar), although as will be appreciated by those in the art, other configurations of substrates may be used as well; for example, three dimensional configurations can be used, for example by embedding the beads in a porous block of plastic that allows sample access to the beads and using a confocal microscope for detection. Similarly, the beads may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Preferred substrates include optical fiber bundles as discussed below, and flat planar substrates such as glass, polystyrene and other plastics and acrylics.

In a preferred embodiment, the substrate is an optical fiber bundle or array, as is generally described in U.S. Ser. Nos. 08/944,850 and 08/519,062, PCT US98/05025, and PCT US98/09163, all of which are expressly incorporated herein by reference. Preferred embodiments utilize preformed unitary fiber optic arrays. By “preformed unitary fiber optic array” herein is meant an array of discrete individual fiber optic strands that are co-axially disposed and joined along their lengths. The fiber strands are generally individually clad. However, one thing that distinguishes a preformed unitary array from other fiber optic formats is that the fibers are not individually physically manipulatable; that is, one strand generally cannot be physically separated at any point along its length from another fiber strand.

Generally, the array of array compositions of the invention can be configured in several ways; see for example U.S. Ser. No. 09/473,904, hereby expressly incorporated by reference. In a preferred embodiment, as is more fully outlined below, a “one component” system is used. That is, a first substrate comprising a plurality of assay locations (sometimes also referred to herein as “assay wells”), such as a micro titer plate, is configured such that each assay location contains an individual array. That is, the assay location and the array location are the same. For example, the plastic material of the micro titer plate can be formed to contain a plurality of “bead wells” in the bottom of each of the assay wells. Beads containing the capture probes of the invention can then be loaded into the bead wells in each assay location as is more fully described below.

Alternatively, a “two component” system can be used. In this embodiment, the individual arrays are formed on a second substrate, which then can be fitted or “dipped” into the first micro titer plate substrate. A preferred embodiment utilizes fiber optic bundles as the individual arrays, generally with “bead wells” etched into one surface of each individual fiber, such that the beads containing the capture probes are loaded onto the end of the fiber optic bundle. The composite array thus comprises a number of individual arrays that are configured to fit within the wells of a micro titer plate. By “composite array” or “combination array” or grammatical equivalents herein is meant a plurality of individual arrays, as outlined above. Generally the number of individual arrays is set by the size of the micro titer plate used; thus, 96 well, 384 well and 1536 well micro titer plates utilize composite arrays comprising 96, 384 and 1536 individual arrays, although as will be appreciated by those in the art, not each micro titer well need contain an individual array. It should be noted that the composite arrays can comprise individual arrays that are identical, similar or different. That is, in some embodiments, it may be desirable to do the same 2,000 assays on 96 different samples; alternatively, doing 192,000 experiments on the same sample (i.e. the same sample in each of the 96 wells) may be desirable. Alternatively, each row or column of the composite array could be the same, for redundancy/quality control. As will be appreciated by those in the art, there are a variety of ways to configure the system. In addition, the random nature of the arrays may mean that the same population of beads may be added to two different surfaces, resulting in substantially similar but perhaps not identical arrays.
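The arithmetic behind the composite array example can be sketched as follows. This is an illustration only: the figure of 2,000 assays per individual array comes from the example in the text, the function name `total_assays` is hypothetical, and every well is assumed to hold an individual array, which the text notes need not be the case.

```python
ASSAYS_PER_ARRAY = 2_000  # per individual array, taken from the example in the text


def total_assays(wells: int, assays_per_array: int = ASSAYS_PER_ARRAY) -> int:
    """Total assay data points when every well holds one individual array."""
    return wells * assays_per_array


if __name__ == "__main__":
    for wells in (96, 384, 1536):
        print(f"{wells:>4} wells x {ASSAYS_PER_ARRAY:,} assays = {total_assays(wells):,}")
```

For a 96 well plate this gives the 192,000 experiments mentioned above, whether they are read as 2,000 assays on each of 96 samples or 192,000 different assays on one sample.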

At least one surface of the substrate is modified to contain discrete, individual sites for later association of microspheres. These sites may comprise physically altered sites, i.e. physical configurations such as wells or small depressions in the substrate that can retain the beads, such that a microsphere can rest in the well, or the use of other forces (magnetic or compressive), or chemically altered or active sites, such as chemically functionalized sites, electrostatically altered sites, hydrophobically/hydrophilically functionalized sites, spots of adhesive, etc.

The sites may be a pattern, i.e. a regular design or configuration, or randomly distributed. A preferred embodiment utilizes a regular pattern of sites such that the sites may be addressed in the X-Y coordinate plane. “Pattern” in this sense includes a repeating unit cell, preferably one that allows a high density of beads on the substrate. However, it should be noted that these sites may not be discrete sites. That is, it is possible to use a uniform surface of adhesive or chemical functionalities, for example, that allows the attachment of beads at any position. That is, the surface of the substrate is modified to allow attachment of the microspheres at individual sites, whether or not those sites are contiguous or non-contiguous with other sites. Thus, the surface of the substrate may be modified such that discrete sites are formed that can only have a single associated bead, or alternatively, the surface of the substrate is modified and beads may go down anywhere, but they end up at discrete sites.

In a preferred embodiment, the surface of the substrate is modified to contain wells, i.e. depressions in the surface of the substrate. This may be done as is generally known in the art using a variety of techniques, including, but not limited to, photolithography, stamping techniques, molding techniques and micro etching techniques. As will be appreciated by those in the art, the technique used will depend on the composition and shape of the substrate.

In a preferred embodiment, physical alterations are made in a surface of the substrate to produce the sites. In a preferred embodiment, the substrate is a fiber optic bundle and the surface of the substrate is a terminal end of the fiber bundle, as is generally described in Ser. Nos. 08/818,199 and 09/151,877, both of which are hereby expressly incorporated by reference. In this embodiment, wells are made in a terminal or distal end of a fiber optic bundle comprising individual fibers. In this embodiment, the cores of the individual fibers are etched, with respect to the cladding, such that small wells or depressions are formed at one end of the fibers. The required depth of the wells will depend on the size of the beads to be added to the wells.

Generally in this embodiment, the microspheres are non-covalently associated in the wells, although the wells may additionally be chemically functionalized as is generally described below, cross-linking agents may be used, or a physical barrier may be used, i.e. a film or membrane over the beads.

In a preferred embodiment, the surface of the substrate is modified to contain chemically modified sites that can be used to attach, either covalently or non-covalently, the microspheres of the invention to the discrete sites or locations on the substrate. “Chemically modified sites” in this context includes, but is not limited to, the addition of a pattern of chemical functional groups including amino groups, carboxy groups, oxo groups and thiol groups, that can be used to covalently attach microspheres, which generally also contain corresponding reactive functional groups; the addition of a pattern of adhesive that can be used to bind the microspheres (either by prior chemical functionalization for the addition of the adhesive or direct addition of the adhesive); the addition of a pattern of charged groups (similar to the chemical functionalities) for the electrostatic attachment of the microspheres, i.e. when the microspheres comprise charged groups opposite to the sites; the addition of a pattern of chemical functional groups that renders the sites differentially hydrophobic or hydrophilic, such that the addition of similarly hydrophobic or hydrophilic microspheres under suitable experimental conditions will result in association of the microspheres to the sites on the basis of hydroaffinity. For example, the use of hydrophobic sites with hydrophobic beads, in an aqueous system, drives the association of the beads preferentially onto the sites. As outlined above, “pattern” in this sense includes the use of a uniform treatment of the surface to allow attachment of the beads at discrete sites, as well as treatment of the surface resulting in discrete sites. As will be appreciated by those in the art, this may be accomplished in a variety of ways.

In some embodiments, the beads are not associated with a substrate. That is, the beads are in solution or are not distributed on a patterned substrate.

In a preferred embodiment, the compositions of the invention further comprise a population of microspheres. By “population” herein is meant a plurality of beads as outlined above for arrays. Within the population are separate subpopulations, which can be a single microsphere or multiple identical microspheres. That is, in some embodiments, as is more fully outlined below, the array may contain only a single bead for each capture probe; preferred embodiments utilize a plurality of beads of each type.

By “microspheres” or “beads” or “particles” or grammatical equivalents herein is meant small discrete particles. The composition of the beads will vary, depending on the class of capture probe and the method of synthesis. Suitable bead compositions include those used in peptide, nucleic acid and organic moiety synthesis, including, but not limited to, plastics, ceramics, glass, polystyrene, methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide, latex or cross-linked dextrans such as Sepharose, cellulose, nylon, cross-linked micelles and Teflon. “Microsphere Detection Guide” from Bangs Laboratories, Fishers, Ind. is a helpful guide.

The beads need not be spherical; irregular particles may be used. In addition, the beads may be porous, thus increasing the surface area of the bead available for either capture probe attachment or tag attachment. The bead sizes range from nanometers, i.e. 100 nm, to millimeters, i.e. 1 mm, with beads from about 0.2 micron to about 200 microns being preferred, and from about 0.5 to about 5 micron being particularly preferred, although in some embodiments smaller beads may be used.

It should be noted that a key component of the invention is the use of a substrate/bead pairing that allows the association or attachment of the beads at discrete sites on the surface of the substrate, such that the beads do not move during the course of the assay.

Each microsphere comprises a capture probe, although as will be appreciated by those in the art, there may be some microspheres which do not contain a capture probe, depending on the synthetic methods.

Attachment of the nucleic acids may be done in a variety of ways, as will be appreciated by those in the art, including, but not limited to, chemical or affinity capture (for example, including the incorporation of derivatized nucleotides such as AminoLink or biotinylated nucleotides that can then be used to attach the nucleic acid to a surface, as well as affinity capture by hybridization), cross-linking, and electrostatic attachment, etc. In a preferred embodiment, affinity capture is used to attach the nucleic acids to the beads. For example, nucleic acids can be derivatized, for example with one member of a binding pair, and the beads derivatized with the other member of a binding pair. Suitable binding pairs are as described herein for IBL/DBL pairs. For example, the nucleic acids may be biotinylated (for example using enzymatic incorporation of biotinylated nucleotides, or by photoactivated cross-linking of biotin). Biotinylated nucleic acids can then be captured on streptavidin-coated beads, as is known in the art. Similarly, other hapten-receptor combinations can be used, such as digoxigenin and anti-digoxigenin antibodies. Alternatively, chemical groups can be added in the form of derivatized nucleotides, that can then be used to add the nucleic acid to the surface.

Preferred attachments are covalent, although even relatively weak interactions (i.e. non-covalent) can be sufficient to attach a nucleic acid to a surface, if there are multiple sites of attachment per each nucleic acid. Thus, for example, electrostatic interactions can be used for attachment, for example by having beads carrying the opposite charge to the bioactive agent.

Similarly, affinity capture utilizing hybridization can be used to attach nucleic acids to beads. For example, as is known in the art, polyA+ RNA is routinely captured by hybridization to oligo-dT beads; this may include oligo-dT capture followed by a cross-linking step, such as psoralen crosslinking. If the nucleic acids of interest do not contain a polyA tract, one can be attached by polymerization with terminal transferase, or via ligation of an oligoA linker, as is known in the art.

Alternatively, chemical crosslinking may be done, for example by photoactivated crosslinking of thymidine to reactive groups, as is known in the art.

In general, probes of the present invention are designed to be complementary to a target sequence (either the target sequence of the sample or to other probe sequences, as is described herein), such that hybridization of the target and the probes of the present invention occurs. This complementarity need not be perfect; there may be any number of base pair mismatches that will interfere with hybridization between the target sequence and the single stranded nucleic acids of the present invention. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence. Thus, by “substantially complementary” herein is meant that the probes are sufficiently complementary to the target sequences to hybridize under the selected reaction conditions.

In a preferred embodiment, each bead comprises a single type of capture probe, although a plurality of individual capture probes are preferably attached to each bead. Similarly, preferred embodiments utilize more than one microsphere containing a unique capture probe; that is, there is redundancy built into the system by the use of subpopulations of microspheres, each microsphere in the subpopulation containing the same capture probe.

As will be appreciated by those in the art, the capture probes may either be synthesized directly on the beads, or they may be made and then attached after synthesis. In a preferred embodiment, linkers are used to attach the capture probes to the beads, to allow good attachment and sufficient flexibility for good interaction with the target molecule, and to avoid undesirable binding reactions.

In a preferred embodiment, the capture probes are synthesized directly on the beads. As is known in the art, many classes of chemical compounds are currently synthesized on solid supports, such as peptides, organic moieties, and nucleic acids. It is a relatively straightforward matter to adjust the current synthetic techniques to use beads.

In a preferred embodiment, the capture probes are synthesized first, and then covalently attached to the beads. As will be appreciated by those in the art, this will be done depending on the composition of the capture probes and the beads. The functionalization of solid support surfaces such as certain polymers with chemically reactive groups such as thiols, amines, carboxyls, etc. is generally known in the art. Accordingly, “blank” microspheres may be used that have surface chemistries that facilitate the attachment of the desired functionality by the user. Some examples of these surface chemistries for blank microspheres include, but are not limited to, amino groups including aliphatic and aromatic amines, carboxylic acids, aldehydes, amides, chloromethyl groups, hydrazide, hydroxyl groups, sulfonates and sulfates.

When random arrays are used, an encoding/decoding system must be used. For example, when microsphere arrays are used, the beads are generally put onto the substrate randomly; as such there are several ways to correlate the functionality on the bead with its location, including the incorporation of unique optical signatures, generally fluorescent dyes, that could be used to identify the chemical functionality on any particular bead. This allows the synthesis of the candidate agents (i.e. compounds such as nucleic acids and antibodies) to be divorced from their placement on an array, i.e. the candidate agents may be synthesized on the beads, and then the beads are randomly distributed on a patterned surface. Since the beads are first coded with an optical signature, this means that the array can later be “decoded”, i.e. after the array is made, a correlation of the location of an individual site on the array with the bead or candidate agent at that particular site can be made. This means that the beads may be randomly distributed on the array, a fast and inexpensive process as compared to either the in situ synthesis or spotting techniques of the prior art.

However, the drawback to these methods is that for a large array, the system requires a large number of different optical signatures, which may be difficult or time-consuming to utilize. Accordingly, the present invention provides several improvements over these methods, generally directed to methods of coding and decoding the arrays. That is, as will be appreciated by those in the art, the placement of the capture probes is generally random, and thus a coding/decoding system is required to identify the probe at each location in the array. This may be done in a variety of ways, as is more fully outlined below, and generally includes: a) the use of a decoding binding ligand (DBL), generally directly labeled, that binds to either the capture probe or to identifier binding ligands (IBLs) attached to the beads; b) positional decoding, for example by either targeting the placement of beads (for example by using photoactivatable or photocleavable moieties to allow the selective addition of beads to particular locations), or by using either sub-bundles or selective loading of the sites, as are more fully outlined below; c) selective decoding, wherein only those beads that bind to a target are decoded; or d) combinations of any of these. In some cases, as is more fully outlined below, this decoding may occur for all the beads, or only for those that bind a particular target sequence. Similarly, this may occur either prior to or after addition of a target sequence. In addition, as outlined herein, the target sequences detected may be either a primary target sequence (e.g. a patient sample), or a reaction product from one of the methods described herein (e.g. an extended SBE probe, a ligated probe, a cleaved signal probe, etc.).

Once the identity (i.e. the actual agent) and location of each microsphere in the array has been fixed, the array is exposed to samples containing the target sequences, although as outlined below, this can be done prior to or during the analysis as well. The target sequences can hybridize (either directly or indirectly) to the capture probes as is more fully outlined below, which results in a change in the optical signal of a particular bead.

In the present invention, “decoding” does not rely on the use of optical signatures, but rather on the use of decoding binding ligands that are added during a decoding step. The decoding binding ligands will bind either to a distinct identifier binding ligand partner that is placed on the beads, or to the capture probe itself. The decoding binding ligands are either directly or indirectly labeled, and thus decoding occurs by detecting the presence of the label. By using pools of decoding binding ligands in a sequential fashion, it is possible to greatly minimize the number of required decoding steps.
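One way to see why sequential pools reduce the number of decoding steps is sketched below. The scheme shown is an assumption made for illustration and is not taken from this passage: in each step, every decoding binding ligand is assumed to carry one of a small number of distinguishable labels, so that the sequence of labels seen at a site over successive steps identifies the probe. The function names `decoding_steps` and `label_schedule` are hypothetical.

```python
def decoding_steps(n_probes: int, n_labels: int) -> int:
    """Smallest number of sequential steps s such that n_labels**s >= n_probes."""
    if n_probes < 1 or n_labels < 2:
        raise ValueError("need at least one probe and two distinguishable labels")
    steps, capacity = 0, 1
    while capacity < n_probes:
        capacity *= n_labels
        steps += 1
    return steps


def label_schedule(probe_index: int, n_labels: int, steps: int) -> list[int]:
    """Label assigned to a probe's decoder in each step (digits in base n_labels)."""
    digits = []
    for _ in range(steps):
        probe_index, digit = divmod(probe_index, n_labels)
        digits.append(digit)
    return digits


if __name__ == "__main__":
    steps = decoding_steps(1000, 4)
    print(f"1000 probes, 4 labels: {steps} steps instead of 1000 single-probe steps")
    print("schedule for probe 5 over 2 steps:", label_schedule(5, 4, 2))
```

Under these assumptions the number of steps grows only logarithmically with the number of probes, whereas testing one labeled decoder at a time would require one step per probe.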

In some embodiments, the microspheres may additionally comprise identifier binding ligands for use in certain decoding systems. By “identifier binding ligands” or “IBLs” herein is meant a compound that will specifically bind a corresponding decoder binding ligand (DBL) to facilitate the elucidation of the identity of the capture probe attached to the bead. That is, the IBL and the corresponding DBL form a binding partner pair. By “specifically bind” herein is meant that the IBL binds its DBL with specificity sufficient to differentiate between the corresponding DBL and other DBLs (that is, DBLs for other IBLs), or other components or contaminants of the system. The binding should be sufficient to remain bound under the conditions of the decoding step, including wash steps to remove non-specific binding. In some embodiments, for example when the IBLs and corresponding DBLs are proteins or nucleic acids, the dissociation constants of the IBL to its DBL will be less than about 10⁻⁴ to 10⁻⁶ M⁻¹, with less than about 10⁻⁵ to 10⁻⁹ M⁻¹ being preferred and less than about 10⁻⁷ to 10⁻⁹ M⁻¹ being particularly preferred.

IBL-DBL binding pairs are known or can be readily found using known techniques. For example, when the IBL is a protein, the DBLs include proteins (particularly including antibodies or fragments thereof (Fabs, etc.)) or small molecules, or vice versa (the IBL is an antibody and the DBL is a protein). Metal ion-metal ion ligand or chelator pairs are also useful. Antigen-antibody pairs, enzymes and substrates or inhibitors, other protein-protein interacting pairs, receptor-ligands, complementary nucleic acids, and carbohydrates and their binding partners are also suitable binding pairs. Nucleic acid-nucleic acid binding protein pairs are also useful. Similarly, as is generally described in U.S. Pat. Nos. 5,270,163, 5,475,096, 5,567,588, 5,595,877, 5,637,459, 5,683,867, 5,705,337, and related patents, hereby incorporated by reference, nucleic acid “aptamers” can be developed for binding to virtually any target; such an aptamer-target pair can be used as the IBL-DBL pair. Similarly, there is a wide body of literature relating to the development of binding pairs based on combinatorial chemistry methods.

In a preferred embodiment, the IBL is a molecule whose color or luminescence properties change in the presence of a selectively-binding DBL. For example, the IBL may be a fluorescent pH indicator whose emission intensity changes with pH. Similarly, the IBL may be a fluorescent ion indicator, whose emission properties change with ion concentration.

Alternatively, the IBL is a molecule whose color or luminescence properties change in the presence of various solvents. For example, the IBL may be a fluorescent molecule such as an ethidium salt whose fluorescence intensity increases in hydrophobic environments. Similarly, the IBL may be a derivative of fluorescein whose color changes between aqueous and nonpolar solvents.

In one embodiment, the DBL may be attached to a bead, i.e. a “decoder bead”, that may carry a label such as a fluorophore.

In a preferred embodiment, the IBL-DBL pair comprise substantially complementary single-stranded nucleic acids. In this embodiment, the binding ligands can be referred to as “identifier probes” and “decoder probes”. Generally, the identifier and decoder probes range from about 4 basepairs in length to about 1000, with from about 6 to about 100 being preferred, and from about 8 to about 40 being particularly preferred. What is important is that the probes are long enough to be specific, i.e. to distinguish between different IBL-DBL pairs, yet short enough to allow both a) dissociation, if necessary, under suitable experimental conditions, and b) efficient hybridization.

In a preferred embodiment, as is more fully outlined below, the IBLs do not bind to DBLs. Rather, the IBLs are used as identifier moieties (“IMs”) that are identified directly, for example through the use of mass spectroscopy.

Alternatively, in a preferred embodiment, the IBL and the capture probe are the same moiety; thus, for example, as outlined herein, particularly when no optical signatures are used, the capture probe can serve as both the identifier and the agent. For example, in the case of nucleic acids, the bead-bound probe (which serves as the capture probe) can also bind decoder probes, to identify the sequence of the probe on the bead. Thus, in this embodiment, the DBLs bind to the capture probes.

In a preferred embodiment, the microspheres may contain an optical signature. That is, as outlined in U.S. Ser. Nos. 08/818,199 and 09/151,877, previous work had each subpopulation of microspheres comprising a unique optical signature or optical tag that is used to identify the unique capture probe of that subpopulation of microspheres; that is, decoding utilizes optical properties of the beads such that a bead comprising the unique optical signature may be distinguished from beads at other locations with different optical signatures. Thus the previous work assigned each capture probe a unique optical signature such that any microspheres comprising that capture probe are identifiable on the basis of the signature. These optical signatures comprised dyes, usually chromophores or fluorophores, that were entrapped or attached to the beads themselves. Diversity of optical signatures utilized different fluorochromes, different ratios of mixtures of fluorochromes, and different concentrations (intensities) of fluorochromes.

In a preferred embodiment, the present invention does not rely solely on the use of optical properties to decode the arrays. However, as will be appreciated by those in the art, it is possible in some embodiments to utilize optical signatures as an additional coding method, in conjunction with the present system. Thus, for example, as is more fully outlined below, the size of the array may be effectively increased while using a single set of decoding moieties in several ways, one of which is the use of optical signatures on some beads. Thus, for example, using one “set” of decoding molecules, the use of two populations of beads, one with an optical signature and one without, allows the effective doubling of the array size. The use of multiple optical signatures similarly increases the possible size of the array.

In a preferred embodiment, each subpopulation of beads comprises a plurality of different IBLs. By using a plurality of different IBLs to encode each capture probe, the number of possible unique codes is substantially increased. That is, by using one unique IBL per capture probe, the size of the array will be the number of unique IBLs (assuming no “reuse” occurs, as outlined below). However, by using a plurality of different IBLs per bead, n, the size of the array can be increased to 2ⁿ, when the presence or absence of each IBL is used as the indicator. For example, the assignment of 10 IBLs per bead generates a 10 bit binary code, where each bit can be designated as “1” (IBL is present) or “0” (IBL is absent). A 10 bit binary code has 2¹⁰ possible variants. However, as is more fully discussed below, the size of the array may be further increased if another parameter is included such as concentration or intensity; thus for example, if two different concentrations of the IBL are used, then the array size increases as 3ⁿ. Thus, in this embodiment, each individual capture probe in the array is assigned a combination of IBLs, which can be added to the beads prior to the addition of the capture probe, after, or during the synthesis of the capture probe, i.e. simultaneous addition of IBLs and capture probe components.
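The code counts described above can be checked with a short calculation. This is a minimal sketch; the function names `code_count` and `presence_codes` are hypothetical, and the count includes the code in which no IBL is present.

```python
from itertools import product


def code_count(n_ibls: int, levels_when_present: int = 1) -> int:
    """Number of distinct codes when each of n_ibls is either absent or
    present at one of `levels_when_present` distinguishable concentrations."""
    return (levels_when_present + 1) ** n_ibls


def presence_codes(n_ibls: int) -> list[tuple[int, ...]]:
    """Enumerate every presence (1) / absence (0) code for n_ibls IBLs."""
    return list(product((0, 1), repeat=n_ibls))


if __name__ == "__main__":
    print(code_count(10))        # presence/absence only: 2**10
    print(code_count(10, 2))     # two concentrations plus absence: 3**10
    print(presence_codes(2))     # the four 2-bit codes
```

With 10 IBLs, presence or absence alone gives 1,024 codes, while two distinguishable concentrations in addition to absence give 59,049.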

Alternatively, the combination of different IBLs can be used to elucidate the sequence of the nucleic acid. Thus, for example, using two different IBLs (IBL1 and IBL2), the first position of a nucleic acid can be elucidated: for example, adenosine can be represented by the presence of both IBL1 and IBL2; thymidine can be represented by the presence of IBL1 but not IBL2, cytosine can be represented by the presence of IBL2 but not IBL1, and guanosine can be represented by the absence of both. The second position of the nucleic acid can be done in a similar manner using IBL3 and IBL4; thus, the presence of IBL1, IBL2, IBL3 and IBL4 gives a sequence of AA; IBL1, IBL2, and IBL3 shows the sequence AT; IBL1, IBL3 and IBL4 gives the sequence TA, etc. The third position utilizes IBL5 and IBL6, etc. In this way, the use of 20 different identifiers can yield a unique code for every possible 10-mer.

In this way, a sort of “bar code” for each sequence can be constructed; the presence or absence of each distinct IBL will allow the identification of each capture probe.
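The two-IBLs-per-position scheme described above can be sketched as follows. The mapping of bases to IBL pairs follows the example in the text (A: both, T: first only, C: second only, G: neither); the function names `encode` and `decode` are hypothetical, and the probe length is assumed to be known when decoding, since a run of guanosines contributes no IBLs.

```python
# (first IBL present, second IBL present) for each base, as in the text's example
BASE_TO_BITS = {"A": (1, 1), "T": (1, 0), "C": (0, 1), "G": (0, 0)}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode(sequence: str) -> set[int]:
    """Return the set of IBL numbers present on a bead carrying `sequence`.

    Position p (0-based) uses IBL numbers 2p+1 and 2p+2.
    """
    present = set()
    for pos, base in enumerate(sequence):
        first, second = BASE_TO_BITS[base]
        if first:
            present.add(2 * pos + 1)
        if second:
            present.add(2 * pos + 2)
    return present


def decode(present: set[int], length: int) -> str:
    """Recover the sequence of a known length from the set of IBLs detected."""
    return "".join(
        BITS_TO_BASE[(int(2 * pos + 1 in present), int(2 * pos + 2 in present))]
        for pos in range(length)
    )


if __name__ == "__main__":
    for seq in ("AA", "AT", "TA"):
        print(seq, sorted(encode(seq)))
    print(decode(encode("GATTACAGCT"), 10))
```

Because each position contributes two bits, 20 IBLs give 2²⁰ = 4¹⁰ codes, one for every possible 10-mer.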

In addition, the use of different concentrations or densities of IBLs allows a “reuse” of sorts. If, for example, the bead comprising a first agent has a 1× concentration of IBL, and a second bead comprising a second agent has a 10× concentration of IBL, using saturating concentrations of the corresponding labelled DBL allows the user to distinguish between the two beads.

Once the microspheres comprising the capture probes are generated, they are added to the substrate to form an array. It should be noted that while most of the methods described herein add the beads to the substrate prior to the assay, the order of making, using and decoding the array can vary. For example, the array can be made, decoded, and then the assay done. Alternatively, the array can be made, used in an assay, and then decoded; this may find particular use when only a few beads need be decoded. Alternatively, the beads can be added to the assay mixture, i.e. the sample containing the target sequences, prior to the addition of the beads to the substrate; after addition and assay, the array may be decoded. This is particularly preferred when the sample comprising the beads is agitated or mixed; this can increase the amount of target sequence bound to the beads per unit time, and thus (in the case of nucleic acid assays) increase the hybridization kinetics. This may find particular use in cases where the concentration of target sequence in the sample is low; generally, for low concentrations, long binding times must be used.

In general, the methods of making the arrays and of decoding the arrays are done to maximize the number of different candidate agents that can be uniquely encoded. The compositions of the invention may be made in a variety of ways. In general, the arrays are made by adding a solution or slurry comprising the beads to a surface containing the sites for attachment of the beads. This may be done in a variety of buffers, including aqueous and organic solvents, and mixtures. The solvent can evaporate, and excess beads are removed.

In a preferred embodiment, when non-covalent methods are used to associate the beads with the array, a novel method of loading the beads onto the array is used. This method comprises exposing the array to a solution of particles (including microspheres and cells) and then applying energy, e.g. agitating or vibrating the mixture. This results in an array comprising more tightly associated particles, as the agitation is done with sufficient energy to cause weakly-associated beads to fall off (or out, in the case of wells). These sites are then available to bind a different bead. In this way, beads that exhibit a high affinity for the sites are selected. Arrays made in this way have two main advantages as compared to a more static loading: first of all, a higher percentage of the sites can be filled easily, and secondly, the arrays thus loaded show a substantial decrease in bead loss during assays. Thus, in a preferred embodiment, these methods are used to generate arrays that have at least about 50% of the sites filled, with at least about 75% being preferred, and at least about 90% being particularly preferred. Similarly, arrays generated in this manner preferably lose less than about 20% of the beads during an assay, with less than about 10% being preferred and less than about 5% being particularly preferred.
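The fill and bead-loss criteria above can be expressed as a simple check. This is a minimal sketch; the function names `fill_tier` and `loss_tier` and the tier label “acceptable” are hypothetical labels introduced here, and the thresholds are those stated in the text.

```python
def fill_tier(filled_sites: int, total_sites: int) -> str:
    """Classify the fraction of sites filled against the 50% / 75% / 90% criteria."""
    fraction = filled_sites / total_sites
    if fraction >= 0.90:
        return "particularly preferred"
    if fraction >= 0.75:
        return "preferred"
    if fraction >= 0.50:
        return "acceptable"
    return "below range"


def loss_tier(beads_before: int, beads_after: int) -> str:
    """Classify the fraction of beads lost during an assay against the
    20% / 10% / 5% criteria."""
    lost = (beads_before - beads_after) / beads_before
    if lost < 0.05:
        return "particularly preferred"
    if lost < 0.10:
        return "preferred"
    if lost < 0.20:
        return "acceptable"
    return "above range"


if __name__ == "__main__":
    print(fill_tier(46_000, 50_000))   # 92% of sites filled
    print(loss_tier(46_000, 42_500))   # about 7.6% of beads lost
```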

In this embodiment, the substrate comprising the surface with the discrete sites is immersed into a solution comprising the particles (beads, cells, etc.). The surface may comprise wells, as is described herein, or other types of sites on a patterned surface such that there is a differential affinity for the sites. This differential affinity results in a competitive process, such that particles that will associate more tightly are selected. Preferably, the entire surface to be “loaded” with beads is in fluid contact with the solution. This solution is generally a slurry ranging from about 10,000:1 beads:solution (vol:vol) to 1:1. Generally, the solution can comprise any number of reagents, including aqueous buffers, organic solvents, salts, other reagent components, etc. In addition, the solution preferably comprises an excess of beads; that is, there are more beads than sites on the array. Preferred embodiments utilize two-fold to billion-fold excess of beads.

The immersion can mimic the assay conditions; for example, if the array is to be “dipped” from above into a microtiter plate comprising samples, this configuration can be repeated for the loading, thus minimizing the beads that are likely to fall out due to gravity.

Once the surface has been immersed, the substrate, the solution, or both are subjected to a competitive process, whereby the particles with lower affinity can be disassociated from the substrate and replaced by particles exhibiting a higher affinity to the site. This competitive process is done by the introduction of energy, in the form of heat, sonication, stirring or mixing, vibrating or agitating the solution or substrate, or both.

A preferred embodiment utilizes agitation or vibration. In general, the amount of manipulation of the substrate is minimized to prevent damage to the array; thus, preferred embodiments utilize the agitation of the solution rather than the array, although either will work. As will be appreciated by those in the art, this agitation can take on any number of forms, with a preferred embodiment utilizing microtiter plates comprising bead solutions being agitated using microtiter plate shakers.

The agitation proceeds for a period of time sufficient to load the array to a desired fill. Depending on the size and concentration of the beads and the size of the array, this time may range from about 1 second to days, with from about 1 minute to about 24 hours being preferred.

It should be noted that not all sites of an array may comprise a bead; that is, there may be some sites on the substrate surface which are empty. In addition, there may be some sites that contain more than one bead, although this is not preferred.

In some embodiments, for example when chemical attachment is done, it is possible to attach the beads in a non-random or ordered way. For example, using photoactivatable attachment linkers or photoactivatable adhesives or masks, selected sites on the array may be sequentially rendered suitable for attachment, such that defined populations of beads are laid down.

The arrays of the present invention are constructed such that information about the identity of the capture probe is built into the array, such that the random deposition of the beads in the fiber wells can be “decoded” to allow identification of the capture probe at all positions. This may be done in a variety of ways, and either before, during or after the use of the array to detect target molecules.

Thus, after the array is made, it is “decoded” in order to identify the location of one or more of the capture probes, i.e. each subpopulation of beads, on the substrate surface.

In a preferred embodiment, pyrosequencing techniques are used to decode the array, as is generally described in “Nucleic Acid Sequencing Using Microsphere Arrays”, filed Oct. 22, 1999 (no U.S. Ser. No. received yet), hereby expressly incorporated by reference.

In a preferred embodiment, a selective decoding system is used. In this case, only those microspheres exhibiting a change in the optical signal as a result of the binding of a target sequence are decoded. This is commonly done when the number of “hits”, i.e. the number of sites to decode, is generally low. That is, the array is first scanned under experimental conditions in the absence of the target sequences. The sample containing the target sequences is added, and only those locations exhibiting a change in the optical signal are decoded. For example, the beads at either the positive or negative signal locations may be either selectively tagged or released from the array (for example through the use of photocleavable linkers), and subsequently sorted or enriched in a fluorescence-activated cell sorter (FACS). That is, either all the negative beads are released, and then the positive beads are either released or analyzed in situ, or alternatively all the positives are released and analyzed. Alternatively, the labels may comprise halogenated aromatic compounds, and detection of the label is done using, for example, gas chromatography, chemical tags, isotopic tags, or mass spectral tags.

As will be appreciated by those in the art, this may also be done in systems where the array is not decoded; i.e. there need not ever be a correlation of bead composition with location. In this embodiment, the beads are loaded on the array, and the assay is run. The “positives”, i.e. those beads displaying a change in the optical signal as is more fully outlined below, are then “marked” to distinguish or separate them from the “negative” beads. This can be done in several ways, preferably using fiber optic arrays. In a preferred embodiment, each bead contains a fluorescent dye. After the assay and the identification of the “positives” or “active beads”, light is shone down either only the positive fibers or only the negative fibers, generally in the presence of a light-activated reagent (typically dissolved oxygen). In the former case, all the active beads are photobleached. Thus, upon non-selective release of all the beads with subsequent sorting, for example using a fluorescence activated cell sorter (FACS) machine, the non-fluorescent active beads can be sorted from the fluorescent negative beads. Alternatively, when light is shone down the negative fibers, all the negatives are non-fluorescent and the positives are fluorescent, and sorting can proceed. The characterization of the attached capture probe may be done directly, for example using mass spectroscopy.

Alternatively, the identification may occur through the use of identifier moieties (“IMs”), which are similar to IBLs but need not necessarily bind to DBLs. That is, rather than elucidate the structure of the capture probe directly, the composition of the IMs may serve as the identifier. Thus, for example, a specific combination of IMs can serve to code the bead, and be used to identify the agent on the bead upon release from the bead followed by subsequent analysis, for example using a gas chromatograph or mass spectroscope.

Alternatively, rather than having each bead contain a fluorescent dye, each bead comprises a non-fluorescent precursor to a fluorescent dye. For example, using photocleavable protecting groups, such as certain ortho-nitrobenzyl groups, on a fluorescent molecule, photoactivation of the fluorochrome can be done. After the assay, light is shone down again either the “positive” or the “negative” fibers, to distinguish these populations. The illuminated precursors are then chemically converted to a fluorescent dye. All the beads are then released from the array, with sorting, to form populations of fluorescent and non-fluorescent beads (either the positives and the negatives or vice versa).

In an alternate preferred embodiment, the sites of attachment of the beads (for example the wells) include a photopolymerizable reagent, or the photopolymerizable agent is added to the assembled array. After the test assay is run, light is shone down again either the “positive” or the “negative” fibers, to distinguish these populations. As a result of the irradiation, either all the positives or all the negatives are polymerized and trapped or bound to the sites, while the other population of beads can be released from the array.

In a preferred embodiment, the location of every capture probe is determined using decoder binding ligands (DBLs). As outlined above, DBLs are binding ligands that will either bind to identifier binding ligands, if present, or to the capture probes themselves, preferably when the capture probe is a nucleic acid or protein.

In a preferred embodiment, as outlined above, the DBL binds to the IBL.

In a preferred embodiment, the capture probes are single-stranded nucleic acids and the DBL is a substantially complementary single-stranded nucleic acid that binds (hybridizes) to the capture probe, termed a decoder probe herein. A decoder probe that is substantially complementary to each candidate probe is made and used to decode the array. In this embodiment, the candidate probes and the decoder probes should be of sufficient length (and the decoding step run under suitable conditions) to allow specificity; i.e. each candidate probe binds to its corresponding decoder probe with sufficient specificity to allow the distinction of each candidate probe.

In a preferred embodiment, the DBLs are either directly or indirectly labeled. In a preferred embodiment, the DBL is directly labeled, that is, the DBL comprises a label. In an alternate embodiment, the DBL is indirectly labeled; that is, a labeling binding ligand (LBL) that will bind to the DBL is used. In this embodiment, the labeling binding ligand-DBL pair can be as described above for IBL-DBL pairs.

Accordingly, the identification of the location of the individual beads (or subpopulations of beads) is done using one or more decoding steps comprising a binding between the labeled DBL and either the IBL or the capture probe (i.e. a hybridization between the candidate probe and the decoder probe when the capture probe is a nucleic acid). After decoding, the DBLs can be removed and the array can be used; however, in some circumstances, for example when the DBL binds to an IBL and not to the capture probe, the removal of the DBL is not required (although it may be desirable in some circumstances). In addition, as outlined herein, decoding may be done either before the array is used in an assay, during the assay, or after the assay.

In one embodiment, a single decoding step is done. In this embodiment, each DBL is labeled with a unique label, such that the number of unique tags is equal to or greater than the number of capture probes (although in some cases, “reuse” of the unique labels can be done, as described herein; similarly, minor variants of candidate probes can share the same decoder, if the variants are encoded in another dimension, i.e. in the bead size or label). For each capture probe or IBL, a DBL is made that will specifically bind to it and contains a unique tag, for example one or more fluorochromes. Thus, the identity of each DBL, both its composition (i.e. its sequence when it is a nucleic acid) and its label, is known. Then, by adding the DBLs to the array containing the capture probes under conditions which allow the formation of complexes (termed hybridization complexes when the components are nucleic acids) between the DBLs and either the capture probes or the IBLs, the location of each DBL can be elucidated. This allows the identification of the location of each capture probe; the random array has been decoded. The DBLs can then be removed, if necessary, and the target sample applied.

In a preferred embodiment, the number of unique labels is less than the number of unique capture probes, and thus a sequential series of decoding steps is used. In this embodiment, decoder probes are divided into n sets for decoding. The number of sets corresponds to the number of unique tags. Each decoder probe is labeled in n separate reactions with n distinct tags. All the decoder probes share the same n tags. The decoder probes are pooled so that each pool contains only one of the n tag versions of each decoder, and no two decoder probes have the same sequence of tags across all the pools. The number of pools required for this to be true is determined by the number of decoder probes and the n. Hybridization of each pool to the array generates a signal at every address. The sequential hybridization of each pool in turn will generate a unique, sequence-specific code for each candidate probe. This identifies the candidate probe at each address in the array. For example, if four tags are used, then 4×n sequential hybridizations can ideally distinguish 4^n sequences, although in some cases more steps may be required. After the hybridization of each pool, the hybrids are denatured and the decoder probes removed, so that the probes are rendered single-stranded for the next hybridization (although it is also possible to hybridize limiting amounts of target so that the available probe is not saturated. Sequential hybridizations can be carried out and analyzed by subtracting pre-existing signal from the previous hybridization).
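The relationship between the number of tags, the number of decoder probes, and the number of pools can be sketched as follows. This is a minimal illustration, not the method of the invention as such: it assumes the ideal case in which every tag sequence is usable, and the function name `min_decoding_steps` is hypothetical.

```python
def min_decoding_steps(num_probes, num_tags):
    """Smallest number of sequential pools such that every decoder probe
    can receive a distinct sequence of tags (ideal case, no redundancy)."""
    if num_probes < 1 or num_tags < 2:
        raise ValueError("need at least one probe and at least two tags")
    steps = 0
    capacity = 1  # number of distinct tag sequences available so far
    while capacity < num_probes:
        capacity *= num_tags  # each extra pool multiplies the code space
        steps += 1
    return steps


# Four tags and 16 probes need two pools; a 17th probe forces a third.
two_pools = min_decoding_steps(16, 4)
three_pools = min_decoding_steps(17, 4)
```

Integer multiplication is used instead of a floating-point logarithm so that exact powers (such as 16 probes with four tags) are not affected by rounding.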

An example is illustrative. Assume an array of 16 probe nucleic acids (numbers 1-16), and four unique tags (four different fluors, for example; labels A-D). Decoder probes 1-16 are made that correspond to the probes on the beads. The first step is to label decoder probes 1-4 with tag A, decoder probes 5-8 with tag B, decoder probes 9-12 with tag C, and decoder probes 13-16 with tag D. The probes are mixed and the pool is contacted with the array containing the beads with the attached candidate probes. The location of each tag (and thus each decoder and candidate probe pair) is then determined. The first set of decoder probes is then removed. A second set is added, but this time, decoder probes 1, 5, 9 and 13 are labeled with tag A, decoder probes 2, 6, 10 and 14 are labeled with tag B, decoder probes 3, 7, 11 and 15 are labeled with tag C, and decoder probes 4, 8, 12 and 16 are labeled with tag D. Thus, those beads that contained tag A in both decoding steps contain candidate probe 1; tag A in the first decoding step and tag B in the second decoding step contain candidate probe 2; tag A in the first decoding step and tag C in the second step contain candidate probe 3; etc. In one embodiment, the decoder probes are labeled in situ; that is, they need not be labeled prior to the decoding reaction. In this embodiment, the incoming decoder probe is shorter than the candidate probe, creating a 5′ “overhang” on the decoding probe. The addition of labeled ddNTPs (each labeled with a unique tag) and a polymerase will allow the addition of the tags in a sequence specific manner, thus creating a sequence-specific pattern of signals. Similarly, other modifications can be done, including ligation, etc.
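The 16-probe, four-tag example can be expressed as a short sketch. The function names `assign_tag_codes` and `decode_bead` are hypothetical, and the code assumes an idealized readout in which exactly one tag is observed at each bead in each decoding step.

```python
from itertools import product


def assign_tag_codes(num_probes, tags, num_steps):
    """Give each decoder probe a distinct sequence of tags, one per step."""
    codes = list(product(tags, repeat=num_steps))
    if num_probes > len(codes):
        raise ValueError("not enough tag sequences for this many probes")
    # Probe numbering starts at 1 to match the example in the text.
    return {probe: codes[probe - 1] for probe in range(1, num_probes + 1)}


def decode_bead(observed_tags, codes):
    """Return the probe whose tag sequence matches the tags read at a bead."""
    lookup = {code: probe for probe, code in codes.items()}
    return lookup.get(tuple(observed_tags))


codes = assign_tag_codes(16, "ABCD", 2)

# Probes carrying tag A in the first pool and tag B in the second pool.
first_pool_tag_a = [p for p, c in codes.items() if c[0] == "A"]
second_pool_tag_b = [p for p, c in codes.items() if c[1] == "B"]

# A bead that read tag A in step one and tag C in step two.
probe_at_bead = decode_bead(["A", "C"], codes)
```

With this assignment the first pool labels probes 1-4 with tag A and the second pool labels probes 2, 6, 10 and 14 with tag B, which reproduces the labeling scheme of the example.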

In addition, since the size of the array will be set by the number of unique decoding binding ligands, it is possible to “reuse” a set of unique DBLs to allow for a greater number of test sites. This may be done in several ways; for example, by using some subpopulations that comprise optical signatures. Similarly, a positional coding scheme can be used within an array; different sub-bundles may reuse the set of DBLs. Similarly, one embodiment utilizes bead size as a coding modality, thus allowing the reuse of the set of unique DBLs for each bead size. Alternatively, sequential partial loading of arrays with beads can also allow the reuse of DBLs. Furthermore, “code sharing” can occur as well.

In a preferred embodiment, the DBLs may be reused by having some subpopulations of beads comprise optical signatures. In a preferred embodiment, the optical signature is generally a mixture of reporter dyes, preferably fluorescent. By varying both the composition of the mixture (i.e. the ratio of one dye to another) and the concentration of the dye (leading to differences in signal intensity), matrices of unique optical signatures may be generated. This may be done by covalently attaching the dyes to the surface of the beads, or alternatively, by entrapping the dye within the bead.

In a preferred embodiment, the encoding can be accomplished in a ratio of at least two dyes, although more encoding dimensions may be added in the size of the beads, for example. In addition, the labels are distinguishable from one another; thus two different labels may comprise different molecules (i.e. two different fluors) or, alternatively, one label at two different concentrations or intensity.

In a preferred embodiment, the dyes are covalently attached to the surface of the beads. This may be done as is generally outlined for the attachment of the capture probes, using functional groups on the surface of the beads. As will be appreciated by those in the art, these attachments are done to minimize the effect on the dye.

In a preferred embodiment, the dyes are non-covalently associated with the beads, generally by entrapping the dyes in the pores of the beads.

Additionally, encoding in the ratios of the two or more dyes, rather than single dye concentrations, is preferred since it provides insensitivity to the intensity of light used to interrogate the reporter dye's signature and to detector sensitivity.
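The insensitivity of ratio encoding can be illustrated with a small sketch. It assumes, for illustration only, that a change in excitation intensity or detector gain scales both dye signals by the same linear factor; the function name `dye_ratio` and the numeric values are hypothetical.

```python
import math


def dye_ratio(signal_dye1, signal_dye2):
    """Ratio of two reporter dye signals measured on the same bead."""
    return signal_dye1 / signal_dye2


# One bead read under three different illumination or gain settings.
# Assumption: both dye signals scale by the same factor.
base_signals = (300.0, 100.0)
ratios = []
for gain in (0.5, 1.0, 2.0):
    s1 = base_signals[0] * gain
    s2 = base_signals[1] * gain
    ratios.append(dye_ratio(s1, s2))

# The absolute signals differ fourfold, but the ratio stays the same.
ratio_is_stable = all(math.isclose(r, ratios[0]) for r in ratios)
```

A code based on a single dye concentration would shift with each gain setting, whereas the ratio does not under this assumption.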

In a preferred embodiment, a spatial or positional coding system is used. In this embodiment, there are sub-bundles or subarrays (i.e. portions of the total array) that are utilized. By analogy with the telephone system, each subarray is an “area code”, that can have the same tags (i.e. telephone numbers) of other subarrays, that are separated by virtue of the location of the subarray. Thus, for example, the same unique tags can be reused from bundle to bundle. Thus, the use of 50 unique tags in combination with 100 different subarrays can form an array of 5000 different capture probes. In this embodiment, it becomes important to be able to identify one bundle from another; in general, this is done either manually or through the use of marker beads, i.e. beads containing unique tags for each subarray.
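The “area code” analogy amounts to treating the pair (subarray, tag) as a two-part address. The following sketch shows one way such an address could be mapped to a capture probe number; the function names and the zero-based numbering are assumptions made for illustration.

```python
def probe_index(subarray, tag, tags_per_subarray=50):
    """Combine a subarray number (the 'area code') and a tag number
    (the 'telephone number') into one capture probe index."""
    if not 0 <= tag < tags_per_subarray:
        raise ValueError("tag number out of range")
    return subarray * tags_per_subarray + tag


def locate(probe, tags_per_subarray=50):
    """Inverse mapping: recover (subarray, tag) from a probe index."""
    return divmod(probe, tags_per_subarray)


# 100 subarrays, each reusing the same 50 tags.
all_probes = {probe_index(s, t) for s in range(100) for t in range(50)}
num_distinct_probes = len(all_probes)
```

Here the same 50 tags appear in every subarray, yet each (subarray, tag) pair maps to a different probe, giving the 5000 capture probes of the example.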

In alternative embodiments, additional encoding parameters can be added, such as microsphere size. For example, the use of different size beads may also allow the reuse of sets of DBLs; that is, it is possible to use microspheres of different sizes to expand the encoding dimensions of the microspheres. Optical fiber arrays can be fabricated containing pixels with different fiber diameters or cross-sections; alternatively, two or more fiber optic bundles, each with different cross-sections of the individual fibers, can be added together to form a larger bundle; or, fiber optic bundles with fiber of the same size cross-sections can be used, but just with different sized beads. With different diameters, the largest wells can be filled with the largest microspheres, then moving on to progressively smaller microspheres in the smaller wells until wells of all sizes are filled. In this manner, the same dye ratio could be used to encode microspheres of different sizes, thereby expanding the number of different oligonucleotide sequences or chemical functionalities present in the array. Although outlined for fiber optic substrates, this as well as the other methods outlined herein can be used with other substrates and with other attachment modalities as well.

In a preferred embodiment, the coding and decoding is accomplished by sequential loading of the microspheres into the array. As outlined above for spatial coding, in this embodiment, the optical signatures can be “reused”. In this embodiment, the library of microspheres each comprising a different capture probe (or the subpopulations each comprise a different capture probe), is divided into a plurality of sublibraries; for example, depending on the size of the desired array and the number of unique tags, 10 sublibraries each comprising roughly 10% of the total library may be made, with each sublibrary comprising roughly the same unique tags. Then, the first sublibrary is added to the fiber optic bundle comprising the wells, and the location of each capture probe is determined, generally through the use of DBLs. The second sublibrary is then added, and the location of each capture probe is again determined. The signal in this case will comprise the signal from the “first” DBL and the “second” DBL; by comparing the two matrices the location of each bead in each sublibrary can be determined. Similarly, adding the third, fourth, etc. sublibraries sequentially will allow the array to be filled.
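The comparison of successive decoding results can be sketched as a set difference: sites that first show a signal after a given loading round are attributed to the sublibrary added in that round. The function name `assign_sublibraries` and the site numbers are hypothetical, and the sketch assumes that beads already loaded are not lost or displaced between rounds.

```python
def assign_sublibraries(occupied_after_each_round):
    """Map each site to the loading round in which it first became occupied.

    occupied_after_each_round: list of sets of site ids, one set per round,
    each listing every site that gave a decoding signal after that round.
    """
    assignment = {}
    seen = set()
    for round_number, occupied in enumerate(occupied_after_each_round, start=1):
        # Newly signalling sites must hold beads from this round's sublibrary.
        for site in occupied - seen:
            assignment[site] = round_number
        seen |= occupied
    return assignment


rounds = [
    {1, 4},           # after loading sublibrary 1
    {1, 2, 4, 7},     # after loading sublibrary 2
    {1, 2, 3, 4, 7},  # after loading sublibrary 3
]
site_to_sublibrary = assign_sublibraries(rounds)
```

Once each site is tied to a sublibrary, the tag read at that site only needs to be unique within that sublibrary, which is what allows the tags to be reused.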

In a preferred embodiment, codes can be “shared” in several ways. In a first embodiment, a single code (i.e. IBL/DBL pair) can be assigned to two or more agents if the target sequences differ sufficiently in their binding strengths. For example, two nucleic acid probes used in an mRNA quantitation assay can share the same code if the ranges of their hybridization signal intensities do not overlap. This can occur, for example, when one of the target sequences is always present at a much higher concentration than the other. Alternatively, the two target sequences might always be present at a similar concentration, but differ in hybridization efficiency.
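Code sharing by non-overlapping intensity ranges can be sketched as a simple range lookup. The probe names, intensity ranges and function names below are hypothetical values chosen for illustration; in practice the ranges would have to be established empirically.

```python
def ranges_overlap(ranges):
    """Return True if any two (low, high) intensity ranges overlap."""
    ordered = sorted(ranges.values())
    return any(prev[1] >= cur[0] for prev, cur in zip(ordered, ordered[1:]))


def resolve_shared_code(signal, ranges):
    """Pick the probe whose expected intensity range contains the signal."""
    for probe, (low, high) in ranges.items():
        if low <= signal <= high:
            return probe
    return None  # signal falls outside every expected range


# Two probes sharing one code; one target is always far more abundant.
shared_code_ranges = {
    "abundant_transcript_probe": (5000.0, 50000.0),
    "rare_transcript_probe": (10.0, 500.0),
}
sharing_is_safe = not ranges_overlap(shared_code_ranges)
bright_bead = resolve_shared_code(12000.0, shared_code_ranges)
dim_bead = resolve_shared_code(120.0, shared_code_ranges)
```

A signal that falls between the two ranges returns `None`, which flags a bead that cannot be assigned with confidence.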

Alternatively, a single code can be assigned to multiple agents if the agents are functionally equivalent. For example, if a set of oligonucleotide probes are designed with the common purpose of detecting the presence of a particular gene, then the probes are functionally equivalent, even though they may differ in sequence. Similarly, an array of this type could be used to detect homologs of known genes. In this embodiment, each gene is represented by a heterologous set of probes, hybridizing to different regions of the gene (and therefore differing in sequence). The set of probes share a common code. If a homolog is present, it might hybridize to some but not all of the probes. The level of homology might be indicated by the fraction of probes hybridizing, as well as the average hybridization intensity. Similarly, multiple antibodies to the same protein could all share the same code.

In a preferred embodiment, decoding of self-assembled random arrays is done on the basis of pH titration. In this embodiment, in addition to capture probes, the beads comprise optical signatures, wherein the optical signatures are generated by the use of pH-responsive dyes (sometimes referred to herein as “pH dyes”) such as fluorophores. This embodiment is similar to that outlined in PCT US98/05025 and U.S. Ser. No. 09/151,877, both of which are expressly incorporated by reference, except that the dyes used in the present invention exhibit changes in fluorescence intensity (or other properties) when the solution pH is adjusted from below the pKa to above the pKa (or vice versa). In a preferred embodiment, a set of pH dyes are used, each with a different pKa, preferably separated by at least 0.5 pH units. Preferred embodiments utilize a pH dye set of pKa's of 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, and 11.5. Each bead can contain any subset of the pH dyes, and in this way a unique code for the capture probe is generated. Thus, the decoding of an array is achieved by titrating the array from pH 1 to pH 13, and measuring the fluorescence signal from each bead as a function of solution pH.
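The idea of reading a dye subset from a titration curve can be sketched with an idealized model. This is an illustration under several assumptions that are not in the text: each dye contributes one unit of signal in proportion to its deprotonated fraction (Henderson-Hasselbalch), all dyes are equally bright, and the candidate pKa values are spaced 2 pH units apart so that a simple threshold on the signal step suffices. Dyes spaced 0.5 units apart would overlap and would need curve fitting instead. All function names are hypothetical.

```python
from itertools import combinations


def fluorescence(ph, bead_pkas):
    """Idealized bead signal: sum of deprotonated fractions of its dyes."""
    return sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in bead_pkas)


def decode_ph_code(measure, candidate_pkas, half_window=1.0, threshold=0.5):
    """Infer which candidate dyes are on a bead from its titration curve.

    measure: function returning the bead's signal at a given pH.
    A dye is called present when the signal rises by more than `threshold`
    across the window centered on that dye's pKa.
    """
    present = []
    for pka in candidate_pkas:
        step = measure(pka + half_window) - measure(pka - half_window)
        if step > threshold:
            present.append(pka)
    return present


CANDIDATE_PKAS = [3.0, 5.0, 7.0, 9.0, 11.0]  # assumed, widely spaced subset

bead_dyes = [3.0, 7.0, 9.0]
decoded = decode_ph_code(lambda ph: fluorescence(ph, bead_dyes), CANDIDATE_PKAS)

# Five present-or-absent dyes give 2**5 possible codes.
num_codes = 2 ** len(CANDIDATE_PKAS)
```

Because each dye is either present or absent, the number of possible codes doubles with every dye added to the set.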

Thus, the present invention provides array compositions comprising a substrate with a surface comprising discrete sites. A population of microspheres is distributed on the sites, and the population comprises at least a first and a second subpopulation. Each subpopulation comprises a capture probe, and, in addition, at least one optical dye with a given pKa. The pKas of the different optical dyes are different.

In a preferred embodiment, “random” decoding probes can be made. By sequential hybridizations or the use of multiple labels, as is outlined above, a unique hybridization pattern can be generated for each sensor element. This allows all the beads representing a given clone to be identified as belonging to the same group. In general, this is done by using random or partially degenerate decoding probes, that bind in a sequence-dependent but not highly sequence-specific manner. The process can be repeated a number of times, each time using a different labeling entity, to generate a different pattern of signals based on quasi-specific interactions. In this way, a unique optical signature is eventually built up for each sensor element. By applying pattern recognition or clustering algorithms to the optical signatures, the beads can be grouped into sets that share the same signature (i.e. carry the same probes).
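The grouping step can be sketched with a deliberately simple stand-in for a clustering algorithm: each bead's signals across the decoding rounds are thresholded into a binary pattern, and beads with identical patterns are grouped. The threshold, the bead identifiers and the signal values are hypothetical; real data would likely call for a more tolerant clustering method.

```python
from collections import defaultdict


def group_by_signature(bead_signals, threshold=0.5):
    """Group beads whose thresholded hybridization patterns are identical.

    bead_signals: dict mapping bead id -> list of normalized signals,
    one per decoding round with a random or degenerate decoding probe.
    """
    groups = defaultdict(list)
    for bead_id, signals in bead_signals.items():
        signature = tuple(int(s >= threshold) for s in signals)
        groups[signature].append(bead_id)
    return dict(groups)


# Four beads read over three rounds; beads 1 and 3 behave alike.
signals = {
    "bead1": [0.9, 0.1, 0.8],
    "bead2": [0.2, 0.7, 0.1],
    "bead3": [0.8, 0.2, 0.9],
    "bead4": [0.1, 0.9, 0.9],
}
groups = group_by_signature(signals)
```

The grouping shows only which beads carry the same probe; linking a group to an actual sequence still requires a separate key.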

In order to identify the actual sequence of the clone itself, additional procedures are required; for example, direct sequencing can be done, or an ordered array containing the clones, such as a spotted cDNA array, can be used to generate a “key” that links a hybridization pattern to a specific clone.

Alternatively, clone arrays can be decoded using binary decoding with vector tags. For example, partially randomized oligos are cloned into a nucleic acid vector (e.g. plasmid, phage, etc.). Each oligonucleotide sequence consists of a subset of a limited set of sequences. For example, if the limited set comprises 10 sequences, each oligonucleotide may have some subset (or all) of the 10 sequences. Thus each of the 10 sequences can be present or absent in the oligonucleotide. Therefore, there are 2^10 or 1,024 possible combinations. The sequences may overlap, and minor variants can also be represented (e.g. A, C, T and G substitutions) to increase the number of possible combinations. A nucleic acid library is cloned into a vector containing the random code sequences. Alternatively, other methods such as PCR can be used to add the tags. In this way it is possible to use a small number of oligo decoding probes to decode an array of clones.
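The present-or-absent scheme is a binary code, which the following sketch makes explicit. The function names are hypothetical, and the mapping of a code number to a set of tag sequences is an illustrative convention rather than anything prescribed by the text.

```python
def tags_for_code(code_number, num_tags=10):
    """Return the tag positions present in the vector for a given code."""
    if not 0 <= code_number < 2 ** num_tags:
        raise ValueError("code number out of range")
    return {i for i in range(num_tags) if (code_number >> i) & 1}


def code_from_tags(positive_tags):
    """Recover the code number from the decoding probes that hybridized."""
    return sum(1 << i for i in positive_tags)


# Ten tag sequences, each present or absent, give 2**10 combinations.
num_combinations = 2 ** 10

# A clone whose vector carries tag sequences 0, 3 and 9.
clone_code = code_from_tags({0, 3, 9})
recovered_tags = tags_for_code(clone_code)
```

Ten decoding probes, each reporting a yes-or-no result, are therefore enough in principle to distinguish up to 1,024 clones.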

As will be appreciated by those in the art, the systems of the invention may take on a large number of different configurations, as is generally depicted in the Figures. In general, there are three types of systems that can be used: (1) “non-sandwich” systems (also referred to herein as “direct” detection) in which the target sequence itself is labeled with detectable labels (again, either because the primers comprise labels or due to the incorporation of labels into the newly synthesized strand); (2) systems in which label probes directly bind to the target analytes; and (3) systems in which label probes are indirectly bound to the target sequences, for example through the use of amplifier probes.

Detection of the reactions of the invention, including the direct detection of products and indirect detection utilizing label probes (i.e. sandwich assays), is preferably done by detecting assay complexes comprising detectable labels, which can be attached to the assay complex in a variety of ways, as is more fully described below.

Once the target sequence has preferably been anchored to the array, an amplifier probe is hybridized to the target sequence, either directly, or through the use of one or more label extender probes, which serve to allow “generic” amplifier probes to be made. As for all the steps outlined herein, this may be done simultaneously with capturing, or sequentially. Preferably, the amplifier probe contains a multiplicity of amplification sequences, although in some embodiments, as described below, the amplifier probe may contain only a single amplification sequence, or at least two amplification sequences. The amplifier probe may take on a number of different forms; either a branched conformation, a dendrimer conformation, or a linear “string” of amplification sequences. Label probes comprising detectable labels (preferably but not required to be fluorophores) then hybridize to the amplification sequences (or in some cases the label probes hybridize directly to the target sequence), and the labels are detected, as is more fully outlined below.

Accordingly, the present invention provides compositions comprising an amplifier probe. By “amplifier probe” or “nucleic acid multimer” or “amplification multimer” or grammatical equivalents herein is meant a nucleic acid probe that is used to facilitate signal amplification. Amplifier probes comprise at least a first single-stranded nucleic acid probe sequence, as defined below, and at least one single-stranded nucleic acid amplification sequence, with a multiplicity of amplification sequences being preferred.

Amplifier probes comprise a first probe sequence that is used, either directly or indirectly, to hybridize to the target sequence. That is, the amplifier probe itself may have a first probe sequence that is substantially complementary to the target sequence, or it has a first probe sequence that is substantially complementary to a portion of an additional probe, in this case called a label extender probe, that has a first portion that is substantially complementary to the target sequence. In a preferred embodiment, the first probe sequence of the amplifier probe is substantially complementary to the target sequence.

In general, as for all the probes herein, the first probe sequence is of a length sufficient to give specificity and stability. Thus generally, the probe sequences of the invention that are designed to hybridize to another nucleic acid (i.e. probe sequences, amplification sequences, portions or domains of larger probes) are at least about 5 nucleosides long, with at least about 10 being preferred and at least about 15 being especially preferred.

In a preferred embodiment, several different amplifier probes are used, each with first probe sequences that will hybridize to a different portion of the target sequence. That is, there is more than one level of amplification; the amplifier probe provides an amplification of signal due to a multiplicity of labelling events, and several different amplifier probes, each with this multiplicity of labels, are used for each target sequence. Thus, preferred embodiments utilize at least two different pools of amplifier probes, each pool having a different probe sequence for hybridization to different portions of the target sequence; the only real limitation on the number of different amplifier probes will be the length of the original target sequence. In addition, it is also possible that the different amplifier probes contain different amplification sequences, although this is generally not preferred.

In a preferred embodiment, the amplifier probe does not hybridize to the sample target sequence directly, but instead hybridizes to a first portion of a label extender probe. This is particularly useful to allow the use of “generic” amplifier probes, that is, amplifier probes that can be used with a variety of different targets. This may be desirable since several of the amplifier probes require special synthesis techniques. Thus, the addition of a relatively short probe as a label extender probe is preferred. Thus, the first probe sequence of the amplifier probe is substantially complementary to a first portion or domain of a first label extender single-stranded nucleic acid probe. The label extender probe also contains a second portion or domain that is substantially complementary to a portion of the target sequence. Both of these portions are preferably at least about 10 to about 50 nucleotides in length, with a range of about 15 to about 30 being preferred. The terms “first” and “second” are not meant to confer an orientation of the sequences with respect to the 5′-3′ orientation of the target or probe sequences. For example, assuming a 5′-3′ orientation of the complementary target sequence, the first portion may be located either 5′ to the second portion, or 3′ to the second portion. For convenience herein, the order of probe sequences is generally shown from left to right.

In a preferred embodiment, more than one label extender probe-amplifier probe pair may be used, that is, n is more than 1. That is, a plurality of label extender probes may be used, each with a portion that is substantially complementary to a different portion of the target sequence; this can serve as another level of amplification. Thus, a preferred embodiment utilizes pools of at least two label extender probes, with the upper limit being set by the length of the target sequence.

In a preferred embodiment, more than one label extender probe is used with a single amplifier probe to reduce non-specific binding, as is generally outlined in U.S. Pat. No. 5,681,697, incorporated by reference herein. In this embodiment, a first portion of the first label extender probe hybridizes to a first portion of the target sequence, and the second portion of the first label extender probe hybridizes to a first probe sequence of the amplifier probe. A first portion of the second label extender probe hybridizes to a second portion of the target sequence, and the second portion of the second label extender probe hybridizes to a second probe sequence of the amplifier probe. These form structures sometimes referred to as “cruciform” structures or configurations, and are generally done to confer stability when large branched or dendrimeric amplifier probes are used.

In addition, as will be appreciated by those in the art, the label extender probes may interact with a preamplifier probe, described below, rather than the amplifier probe directly.

Similarly, as outlined above, a preferred embodiment utilizes several different amplifier probes, each with first probe sequences that will hybridize to a different portion of the label extender probe. In addition, as outlined above, it is also possible that the different amplifier probes contain different amplification sequences, although this is generally not preferred.

In addition to the first probe sequence, the amplifier probe also comprises at least one amplification sequence. By “amplification sequence” or “amplification segment” or grammatical equivalents herein is meant a sequence that is used, either directly or indirectly, to bind to a first portion of a label probe as is more fully described below. Preferably, the amplifier probe comprises a multiplicity of amplification sequences, with from about 3 to about 1000 being preferred, from about 10 to about 100 being particularly preferred, and about 50 being especially preferred. In some cases, for example when linear amplifier probes are used, from 1 to about 20 is preferred with from about 5 to about 10 being particularly preferred.

The amplification sequences may be linked to each other in a variety of ways, as will be appreciated by those in the art. They may be covalently linked directly to each other, or to intervening sequences or chemical moieties, through nucleic acid linkages such as phosphodiester bonds, PNA bonds, etc., or through interposed linking agents such as amino acid, carbohydrate or polyol bridges, or through other cross-linking agents or binding partners. The site(s) of linkage may be at the ends of a segment, and/or at one or more internal nucleotides in the strand. In a preferred embodiment, the amplification sequences are attached via nucleic acid linkages.

In a preferred embodiment, branched amplifier probes are used, as are generally described in U.S. Pat. No. 5,124,246, hereby incorporated by reference. Branched amplifier probes may take on “fork-like” or “comb-like” conformations. “Fork-like” branched amplifier probes generally have three or more oligonucleotide segments emanating from a point of origin to form a branched structure. The point of origin may be another nucleotide segment or a multifunctional molecule to which at least three segments can be covalently or tightly bound. “Comb-like” branched amplifier probes have a linear backbone with a multiplicity of sidechain oligonucleotides extending from the backbone. In either conformation, the pendant segments will normally depend from a modified nucleotide or other organic moiety having the appropriate functional groups for attachment of oligonucleotides. Furthermore, in either conformation, a large number of amplification sequences are available for binding, either directly or indirectly, to detection probes. In general, these structures are made as is known in the art, using modified multifunctional nucleotides, as is described in U.S. Pat. Nos. 5,635,352 and 5,124,246, among others.

In a preferred embodiment, dendrimer amplifier probes are used, as are generally described in U.S. Pat. No. 5,175,270, hereby expressly incorporated by reference. Dendrimeric amplifier probes have amplification sequences that are attached via hybridization, and thus have portions of double-stranded nucleic acid as a component of their structure. The outer surface of the dendrimer amplifier probe has a multiplicity of amplification sequences.

In a preferred embodiment, linear amplifier probes are used that have individual amplification sequences linked end-to-end, either directly or with short intervening sequences, to form a polymer. As with the other amplifier configurations, there may be additional sequences or moieties between the amplification sequences. In one embodiment, the linear amplifier probe has a single amplification sequence.

In addition, the amplifier probe may be totally linear, totally branched, totally dendrimeric, or any combination thereof.

The amplification sequences of the amplifier probe are used, either directly or indirectly, to bind to a label probe to allow detection. In a preferred embodiment, the amplification sequences of the amplifier probe are substantially complementary to a first portion of a label probe. Alternatively, amplifier extender probes are used that have a first portion that binds to the amplification sequence and a second portion that binds to the first portion of the label probe.

In addition, the compositions of the invention may include “preamplifier” molecules, which serve as a bridging moiety between the label extender molecules and the amplifier probes. In this way, more amplifier and thus more labels are ultimately bound to the detection probes. Preamplifier molecules may be either linear or branched, and typically contain in the range of about 30-3000 nucleotides.
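
As a rough worked illustration of how these levels of amplification compound (an idealized upper bound, not a figure from this disclosure), suppose $n$ label extender probes bind each target, each label extender probe captures one preamplifier that carries $p$ amplifier probes, and each amplifier probe presents $m$ amplification sequences. The symbols $p$, $m$ and $L_{\max}$ are introduced here for illustration only, and full occupancy of every binding site is assumed:

$$L_{\max} = n \cdot p \cdot m$$

For example, $n = 2$, $p = 4$ and $m = 50$ would give at most $2 \cdot 4 \cdot 50 = 400$ label probes per target; without a preamplifier ($p = 1$) the bound is $100$. Actual yields would be lower wherever hybridization is incomplete.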

Thus, label probes are either substantially complementary to an amplification sequence or to a portion of the target sequence.

Detection of the nucleic acid reactions of the invention, including the direct detection of genotyping products and indirect detection utilizing label probes (i.e. sandwich assays), is done by detecting assay complexes comprising labels.

In a preferred embodiment, several levels of redundancy are built into the arrays of the invention. Building redundancy into an array gives several significant advantages, including the ability to make quantitative estimates of confidence about the data and significant increases in sensitivity. Thus, preferred embodiments utilize array redundancy. As will be appreciated by those in the art, there are at least two types of redundancy that can be built into an array: the use of multiple identical sensor elements (termed herein “sensor redundancy”), and the use of multiple sensor elements directed to the same target analyte, but comprising different chemical functionalities (termed herein “target redundancy”). For example, for the detection of nucleic acids, sensor redundancy utilizes a plurality of sensor elements such as beads comprising identical binding ligands such as probes. Target redundancy utilizes sensor elements with different probes to the same target: one probe may span the first 25 bases of the target, a second probe may span the second 25 bases of the target, etc. By building either or both of these types of redundancy into an array, significant benefits are obtained.
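
The target-redundancy layout described above (one probe over the first 25 bases, a second over the next 25, and so on) can be sketched as a simple tiling of the target. This is a minimal illustration only; the function name `tile_probe_windows` and its parameters are hypothetical and do not come from this disclosure.

```python
def tile_probe_windows(target_length, probe_length=25):
    """Return (start, end) index pairs for adjacent, non-overlapping
    probe windows along a target of the given length.

    Indices are 0-based and end-exclusive. A trailing stretch shorter
    than probe_length is left uncovered.
    """
    if probe_length <= 0:
        raise ValueError("probe_length must be positive")
    last_start = target_length - probe_length
    return [(start, start + probe_length)
            for start in range(0, last_start + 1, probe_length)]


# A 60-base target yields two adjacent 25-base windows.
windows = tile_probe_windows(60)
```

Adjacent windows are used here because the text later notes that redundant probes should preferably not compete for a single binding site.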

For example, a variety of statistical mathematical analyses may be done.

In addition, while this is generally described herein for bead arrays, as will be appreciated by those in the art, these techniques can be used for any type of array designed to detect target analytes. Furthermore, while these techniques are generally described for nucleic acid systems, these techniques are useful in the detection of other binding ligand/target analyte systems as well.

In a preferred embodiment, sensor redundancy is used. In this embodiment, a plurality of sensor elements, e.g. beads, comprising identical bioactive agents are used. That is, each subpopulation comprises a plurality of beads comprising identical bioactive agents (e.g. binding ligands). By using a number of identical sensor elements for a given array, the optical signal from each sensor element can be combined and any number of statistical analyses run, as outlined below. This can be done for a variety of reasons. For example, in time varying measurements, redundancy can significantly reduce the noise in the system. For non-time based measurements, redundancy can significantly increase the confidence of the data.

In a preferred embodiment, a plurality of identical sensor elements are used. As will be appreciated by those in the art, the number of identical sensor elements will vary with the application and use of the sensor array. In general, anywhere from 2 to thousands may be used, with from 2 to 100 being preferred, 2 to 50 being particularly preferred and from 5 to 20 being especially preferred. In general, preliminary results indicate that roughly 10 beads give a sufficient advantage, although for some applications, more identical sensor elements can be used.

Once obtained, the optical response signals from a plurality of sensor beads within each bead subpopulation can be manipulated and analyzed in a wide variety of ways, including baseline adjustment, averaging, standard deviation analysis, distribution and cluster analysis, confidence interval analysis, mean testing, etc.

In a preferred embodiment, the first manipulation of the optical response signals is an optional baseline adjustment. In a typical procedure, the standardized optical responses are adjusted to start at a value of 0.0 by subtracting the value 1.0 from all data points. Doing this allows the baseline-loop data to remain at zero even when summed together, and the random response signal noise is canceled out. When the sample is a fluid, the fluid pulse-loop temporal region, however, frequently exhibits a characteristic change in response, either positive, negative or neutral, prior to the sample pulse and often requires a baseline adjustment to overcome noise associated with drift in the first few data points due to charge buildup in the CCD camera. If no drift is present, typically the baseline from the first data point for each bead sensor is subtracted from all the response data for the same bead. If drift is observed, the average baseline from the first ten data points for each bead sensor is subtracted from all the response data for the same bead. By applying this baseline adjustment, when multiple bead responses are added together they can be amplified while the baseline remains at zero. Since all beads respond at the same time to the sample (e.g. the sample pulse), they all see the pulse at the exact same time and there is no registering or adjusting needed for overlaying their responses. In addition, other types of baseline adjustment may be done, depending on the requirements and output of the system used.
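
The per-bead baseline subtraction just described can be sketched as follows. This is a minimal illustration, assuming each bead's response is held as a list of intensity values over time; the function name `baseline_adjust` and its parameters are hypothetical.

```python
from statistics import mean


def baseline_adjust(response, drift=False, n_baseline=10):
    """Subtract a per-bead baseline from one bead's response trace.

    response   : list of intensity values over time for a single bead
    drift      : if False, the first data point is taken as the baseline;
                 if True, the average of the first n_baseline points is used
    n_baseline : number of leading points averaged when drift is True
    """
    if not response:
        return []
    if drift:
        baseline = mean(response[:n_baseline])
    else:
        baseline = response[0]
    return [value - baseline for value in response]


# Hypothetical trace: flat baseline followed by a response to the sample pulse.
trace = [1.0, 1.0, 1.0, 1.5, 2.0]
adjusted = baseline_adjust(trace)
```

After adjustment every trace starts at or near zero, so traces from different beads can be added without accumulating a baseline offset.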

Once the baseline has been adjusted, a number of possible statistical analyses may be run to generate known statistical parameters. Analyses based on redundancy are known and generally described in texts such as Freund and Walpole, Mathematical Statistics, Prentice Hall, Inc. New Jersey, 1980, hereby incorporated by reference in its entirety.

In a preferred embodiment, signal summing is done by simply adding the intensity values of all responses at each time point, generating a new temporal response comprised of the sum of all bead responses. These values can be baseline-adjusted or raw. As for all the analyses described herein, signal summing can be performed in real time or during post-data acquisition data reduction and analysis. In one embodiment, signal summing is performed with a commercial spreadsheet program (Excel, Microsoft, Redmond, Wash.) after optical response data is collected.

In a preferred embodiment, cumulative response data is generated by simply adding all data points in successive time intervals. This final column, comprised of the sum of all data points at a particular time interval, may then be compared or plotted with the individual bead responses to determine the extent of signal enhancement or improved signal-to-noise ratios.
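
Signal summing across the beads of a subpopulation can be sketched as a column-wise sum, analogous to the spreadsheet column described above. This is a minimal illustration; the function name `sum_bead_responses` is hypothetical, and the traces are assumed to share the same time points.

```python
def sum_bead_responses(traces):
    """Add the intensity values of all beads at each time point.

    traces : list of per-bead response lists, all of equal length
    Returns one summed temporal response.
    """
    if not traces:
        return []
    length = len(traces[0])
    if any(len(trace) != length for trace in traces):
        raise ValueError("all traces must cover the same time points")
    # zip(*traces) groups the values of every bead at one time point.
    return [sum(values) for values in zip(*traces)]


# Two hypothetical baseline-adjusted bead traces.
summed = sum_bead_responses([[0.0, 1.0, 2.0], [0.0, 2.0, 4.0]])
```

The summed trace can then be plotted against the individual bead traces to judge the extent of signal enhancement.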

In a preferred embodiment, the mean of the subpopulation (i.e. the plurality of identical beads) is determined, using the well-known Equation 1:

$\begin{matrix}{\mu = \frac{\Sigma x_{i}}{n}} & {{Equation}\mspace{14mu} 1}\end{matrix}$

In some embodiments, the subpopulation may be redefined to exclude some beads if necessary (for example for obvious outliers, as discussed below).

In a preferred embodiment, the standard deviation of the subpopulation can be determined, generally using Equation 2 (for the entire subpopulation) and Equation 3 (for less than the entire subpopulation):

$\begin{matrix}{\sigma = \sqrt{\frac{\Sigma\left( {x_{i} - \mu} \right)^{2}}{n}}} & {{Equation}\mspace{14mu} 2} \\{s = \sqrt{\frac{\Sigma\left( {x_{i} - \overset{\_}{x}} \right)^{2}}{n - 1}}} & {{Equation}\mspace{14mu} 3}\end{matrix}$

As for the mean, the subpopulation may be redefined to exclude some beads if necessary (for example for obvious outliers, as discussed below).
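
Equations 1 to 3 map directly onto functions in the Python standard library. The sketch below is illustrative only; the function name `summarize_subpopulation` and the example intensities are hypothetical.

```python
from statistics import fmean, pstdev, stdev


def summarize_subpopulation(intensities):
    """Compute the mean and standard deviations of one bead subpopulation.

    mean  : Equation 1
    sigma : Equation 2, dividing by n (entire subpopulation)
    s     : Equation 3, dividing by n - 1 (a sample of the subpopulation)
    At least two values are required for the n - 1 form.
    """
    return {
        "n": len(intensities),
        "mean": fmean(intensities),
        "sigma": pstdev(intensities),
        "s": stdev(intensities),
    }


# Hypothetical intensities from eight identical beads.
summary = summarize_subpopulation([2, 4, 4, 4, 5, 5, 7, 9])
```

For these eight values the mean is 5 and the Equation 2 standard deviation is 2; the Equation 3 value is slightly larger because of the n - 1 divisor.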

In a preferred embodiment, statistical analyses are done to evaluate whether a particular data point has statistical validity within a subpopulation by using techniques including, but not limited to, t distribution and cluster analysis. This may be done to statistically discard outliers that may otherwise skew the result, thereby increasing the signal-to-noise ratio of any particular experiment. This may be done using Equation 4:

$\begin{matrix}{t = \frac{\overset{\_}{x} - \mu}{s/\sqrt{n}}} & {{Equation}\mspace{14mu} 4}\end{matrix}$
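
Equation 4 can be evaluated as in the following sketch, which computes the t statistic for a sample of bead intensities against a reference mean. The function name `t_statistic` is hypothetical; how the resulting value is compared against a t distribution to accept or discard data is left to the analyst.

```python
from math import sqrt
from statistics import fmean, stdev


def t_statistic(sample, mu):
    """Evaluate Equation 4: t = (x_bar - mu) / (s / sqrt(n)).

    sample : list of at least two intensity values that are not all equal
    mu     : reference mean to test against
    """
    n = len(sample)
    s = stdev(sample)  # sample standard deviation, Equation 3
    if s == 0:
        raise ValueError("sample has zero variance; t is undefined")
    return (fmean(sample) - mu) / (s / sqrt(n))


# Hypothetical sample of three bead intensities tested against mu = 2.
t = t_statistic([2.0, 4.0, 6.0], 2.0)
```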

In a preferred embodiment, the quality of the data is evaluated using confidence intervals, as is known in the art. Confidence intervals can be used to facilitate more comprehensive data processing to measure the statistical validity of a result.
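
One common form of confidence interval for a subpopulation mean is the sample mean plus or minus a critical value times the standard error. The sketch below assumes that form; the function name `confidence_interval` is hypothetical, and the critical value is supplied by the caller (for example, about 2.262 for a 95% interval with 10 beads, taken from a t table with 9 degrees of freedom).

```python
from math import sqrt
from statistics import fmean, stdev


def confidence_interval(sample, critical_value):
    """Return (lower, upper) bounds: x_bar +/- critical_value * s / sqrt(n).

    sample         : list of at least two intensity values
    critical_value : t or z quantile chosen by the caller for the
                     desired confidence level and sample size
    """
    n = len(sample)
    centre = fmean(sample)
    half_width = critical_value * stdev(sample) / sqrt(n)
    return (centre - half_width, centre + half_width)


# Hypothetical three-bead sample with an illustrative critical value of 1.0.
lower, upper = confidence_interval([2.0, 4.0, 6.0], 1.0)
```

A narrower interval for the same critical value indicates tighter agreement among the redundant beads.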

In a preferred embodiment, statistical parameters of a subpopulation of beads are used to do hypothesis testing. One application is tests concerning means, also called mean testing. In this application, statistical evaluation is done to determine whether two subpopulations are different. For example, one sample could be compared with another sample for each subpopulation within an array to determine if the variation is statistically significant.

In addition, mean testing can also be used to differentiate two different assays that share the same code. If the two assays give results that are statistically distinct from each other, then the subpopulations that share a common code can be distinguished from each other on the basis of the assay and the mean test, shown below in Equation 5:

$\begin{matrix}{z = \frac{\overset{\_}{x_{1}} - \overset{\_}{x_{2}}}{\sqrt{\frac{\sigma_{1}^{2}}{n_{1}} + \frac{\sigma_{2}^{2}}{n_{2}}}}} & {{Equation}\mspace{14mu} 5}\end{matrix}$
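
Equation 5 can be evaluated as in the sketch below. The function names `z_statistic` and `two_sided_p` are hypothetical, and converting z to a two-sided p-value under a standard normal distribution is an added assumption for illustration rather than a step stated in the text.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(mean1, var1, n1, mean2, var2, n2):
    """Evaluate Equation 5 for two bead subpopulations.

    mean1, mean2 : subpopulation means
    var1, var2   : subpopulation variances (sigma squared)
    n1, n2       : number of beads in each subpopulation
    """
    return (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)


def two_sided_p(z):
    """Two-sided p-value for z under a standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical subpopulations: means 10 and 8, variance 4, four beads each.
z = z_statistic(10.0, 4.0, 4, 8.0, 4.0, 4)
p = two_sided_p(z)
```

A small p-value would suggest that the two subpopulations sharing a code are statistically distinguishable on the basis of their assay results.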

Furthermore, analyzing the distribution of individual members of a subpopulation of sensor elements may be done. For example, a subpopulation distribution can be evaluated to determine whether the distribution is binomial, Poisson, hypergeometric, etc.

In addition to the sensor redundancy, a preferred embodiment utilizes a plurality of sensor elements that are directed to a single target analyte but yet are not identical. For example, a single target nucleic acid analyte may have two or more sensor elements each comprising a different probe. This adds a level of confidence as non-specific binding interactions can be statistically minimized. When nucleic acid target analytes are to be evaluated, the redundant nucleic acid probes may be overlapping, adjacent, or spatially separated. However, it is preferred that two probes do not compete for a single binding site, so adjacent or separated probes are preferred. Similarly, when proteinaceous target analytes are to be evaluated, preferred embodiments utilize bioactive agent binding agents that bind to different parts of the target. For example, when antibodies (or antibody fragments) are used as bioactive agents for the binding of target proteins, preferred embodiments utilize antibodies to different epitopes.

In this embodiment, a plurality of different sensor elements may be used, with from about 2 to about 20 being preferred, and from about 2 to about 10 being especially preferred, and from 2 to about 5 being particularly preferred, including 2, 3, 4 or 5. However, as above, more may also be used, depending on the application.

As above, any number of statistical analyses may be run on the data from target redundant sensors.

One benefit of the sensor element summing (referred to herein as “bead summing” when beads are used) is the increase in sensitivity that can occur.

In addition, the present invention is directed to the use of adapter sequences to assemble arrays comprising target analytes, including non-nucleic acid target analytes. By “target analyte” or “analyte” or grammatical equivalents herein is meant any molecule, compound or particle to be detected. As outlined below, target analytes preferably bind to binding ligands, as is more fully described below. As will be appreciated by those in the art, a large number of analytes may be detected using the present methods; basically, any target analyte for which a binding ligand, described below, may be made may be detected using the methods of the invention.

Suitable analytes include organic and inorganic molecules, including biomolecules. In a preferred embodiment, the analyte may be an environmental pollutant (including pesticides, insecticides, toxins, etc.); a chemical (including solvents, polymers, organic materials, etc.); therapeutic molecules (including therapeutic and abused drugs, antibiotics, etc.); biomolecules (including hormones, cytokines, proteins, lipids, carbohydrates, cellular membrane antigens and receptors (neural, hormonal, nutrient, and cell surface receptors) or their ligands, etc.); whole cells (including prokaryotic (such as pathogenic bacteria) and eukaryotic cells, including mammalian tumor cells); viruses (including retroviruses, herpesviruses, adenoviruses, lentiviruses, etc.); and spores; etc. Particularly preferred analytes are environmental pollutants; nucleic acids; proteins (including enzymes, antibodies, antigens, growth factors, cytokines, etc.); therapeutic and abused drugs; cells; and viruses.

In a preferred embodiment, the target analyte is a protein. As will be appreciated by those in the art, there are a large number of possible proteinaceous target analytes that may be detected using the present invention. By “proteins” or grammatical equivalents herein is meant proteins, oligopeptides and peptides, derivatives and analogs, including proteins containing non-naturally occurring amino acids and amino acid analogs, and peptidomimetic structures. The side chains may be in either the (R) or the (S) configuration. In a preferred embodiment, the amino acids are in the (S) or L-configuration. As discussed below, when the protein is used as a binding ligand, it may be desirable to utilize protein analogs to retard degradation by sample contaminants.

Suitable protein target analytes include, but are not limited to, (1) immunoglobulins, particularly IgEs, IgGs and IgMs, and particularly therapeutically or diagnostically relevant antibodies, including but not limited to, for example, antibodies to human albumin, apolipoproteins (including apolipoprotein E), human chorionic gonadotropin, cortisol, α-fetoprotein, thyroxin, thyroid stimulating hormone (TSH), antithrombin, antibodies to pharmaceuticals (including antiepileptic drugs (phenytoin, primidone, carbamazepine, ethosuximide, valproic acid, and phenobarbital), cardioactive drugs (digoxin, lidocaine, procainamide, and disopyramide), bronchodilators (theophylline), antibiotics (chloramphenicol, sulfonamides), antidepressants, immunosuppressants, abused drugs (amphetamine, methamphetamine, cannabinoids, cocaine and opiates)), and antibodies to any number of viruses (including orthomyxoviruses (e.g. influenza virus), paramyxoviruses (e.g. respiratory syncytial virus, mumps virus, measles virus), adenoviruses, rhinoviruses, coronaviruses, reoviruses, togaviruses (e.g. rubella virus), parvoviruses, poxviruses (e.g. variola virus, vaccinia virus), enteroviruses (e.g. poliovirus, coxsackievirus), hepatitis viruses (including A, B and C), herpesviruses (e.g. Herpes simplex virus, varicella-zoster virus, cytomegalovirus, Epstein-Barr virus), rotaviruses, Norwalk viruses, hantavirus, arenavirus, rhabdovirus (e.g. rabies virus), retroviruses (including HIV, HTLV-I and -II), papovaviruses (e.g. papillomavirus), polyomaviruses, and picornaviruses, and the like), and bacteria (including a wide variety of pathogenic and non-pathogenic prokaryotes of interest including Bacillus; Vibrio, e.g. V. cholerae; Escherichia, e.g. enterotoxigenic E. coli; Shigella, e.g. S. dysenteriae; Salmonella, e.g. S. typhi; Mycobacterium, e.g. M. tuberculosis, M. leprae; Clostridium, e.g. C. botulinum, C. tetani, C. difficile, C. perfringens; Corynebacterium, e.g. C. diphtheriae; Streptococcus, e.g. S. pyogenes, S. pneumoniae; Staphylococcus, e.g. S. aureus; Haemophilus, e.g. H. influenzae; Neisseria, e.g. N. meningitidis, N. gonorrhoeae; G. lamblia; Yersinia, e.g. Y. pestis; Pseudomonas, e.g. P. aeruginosa, P. putida; Chlamydia, e.g. C. trachomatis; Bordetella, e.g. B. pertussis; Treponema, e.g. T. pallidum; and the like); (2) enzymes (and other proteins), including but not limited to, enzymes used as indicators of or treatment for heart disease, including creatine kinase, lactate dehydrogenase, aspartate amino transferase, troponin T, myoglobin, fibrinogen, cholesterol, triglycerides, thrombin, tissue plasminogen activator (tPA); pancreatic disease indicators including amylase, lipase, chymotrypsin and trypsin; liver function enzymes and proteins including cholinesterase, bilirubin, and alkaline phosphatase; aldolase, prostatic acid phosphatase, terminal deoxynucleotidyl transferase, and bacterial and viral enzymes such as HIV protease; (3) hormones and cytokines (many of which serve as ligands for cellular receptors) such as erythropoietin (EPO), thrombopoietin (TPO), the interleukins (including IL-1 through IL-17), insulin, insulin-like growth factors (including IGF-1 and -2), epidermal growth factor (EGF), transforming growth factors (including TGF-α and TGF-β), human growth hormone, transferrin, low density lipoprotein, high density lipoprotein, leptin, VEGF, PDGF, ciliary neurotrophic factor, prolactin, adrenocorticotropic hormone (ACTH), calcitonin, human chorionic gonadotropin, cortisol, estradiol, follicle stimulating hormone (FSH), thyroid-stimulating hormone (TSH), luteinizing hormone (LH), progesterone, testosterone; and (4) other proteins (including α-fetoprotein and carcinoembryonic antigen (CEA)).

In addition, any of the biomolecules for which antibodies may be detected may be detected directly as well; that is, detection of virus or bacterial cells, therapeutic and abused drugs, etc., may be done directly.

Suitable target analytes include carbohydrates, including but not limited to, markers for breast cancer (CA15-3, CA 549, CA 27.29), mucin-like carcinoma associated antigen (MCA), ovarian cancer (CA125), pancreatic cancer (DE-PAN-2), and colorectal and pancreatic cancer (CA 19, CA 50, CA242).

The adapter sequences may be chosen as outlined above. These adapter sequences can then be added to the target analytes using a variety of techniques. In general, as described above, noncovalent attachment using binding partner pairs may be done, or covalent attachment using chemical moieties (including linkers).

Once the adapter sequences are associated with the target analyte, including target nucleic acids, the compositions are added to an array. In one embodiment a plurality of hybrid adapter sequence/target analytes are pooled prior to addition to an array. All of the methods and compositions herein are drawn to compositions and methods for detecting the presence of target analytes, particularly nucleic acids, using adapter arrays.

Advantages of using adapters include, but are not limited to, for example, the ability to create universal arrays. That is, a single array is utilized with each capture probe designed to hybridize with a specific adapter. The adapters are joined to any number of target analytes, such as nucleic acids, as is described herein. Thus, the same array is used for vastly different target analytes. Furthermore, hybridization of adapters with capture probes results in non-covalent attachment of the target nucleic acid to the microsphere. As such, the target nucleic acid/adapter hybrid is easily removed, and the microsphere/capture probe can be reused. In addition, the construction of kits is greatly facilitated by the use of adapters. For example, arrays or microspheres can be prepared that comprise the capture probe; the adapters can be packaged along with the reagents for attachment to any target analyte of interest. Thus, one need only attach the adapter to the target analyte and disperse on the array for the construction of an array of target analytes.

Once made, the compositions of the invention find use in a number of applications. In a preferred embodiment, the compositions are used to probe a sample solution for the presence or absence of a target sequence, including the quantification of the amount of target sequence present.

For SNP analysis, the ratio of different labels at a particular location on the array indicates the homozygosity or heterozygosity of the target sample, assuming the same concentration of each readout probe is used. Thus, for example, assuming a first readout probe comprising a first base at the readout position with a first detectable label and a second readout probe comprising a second base at the readout position with a second detectable label, equal signals (roughly 1:1, taking into account the different signal intensities of the different labels, different hybridization efficiencies, and other reasons) of the first and second labels indicate a heterozygote. The absence of a signal from the first label (or a ratio of approximately 0:1) indicates a homozygote of the second detection base; the absence of a signal from the second label (or a ratio of approximately 1:0) indicates a homozygote for the first detection base. As is appreciated by those in the art, the actual ratios for any particular system are generally determined empirically. The ratios also allow for SNP quantitation.
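
The ratio-based genotype call described above can be sketched as follows. This is a minimal illustration only: the function name `call_genotype` and the cutoff values `het_low` and `het_high` are hypothetical placeholders, since the text states that the actual ratios for any particular system are determined empirically.

```python
def call_genotype(signal1, signal2, het_low=0.3, het_high=0.7):
    """Classify a SNP from the signals of two differently labeled
    readout probes measured at one array location.

    signal1, signal2  : non-negative intensities of the first and second label
    het_low, het_high : illustrative cutoffs on the first label's share of
                        the total signal; real cutoffs are set empirically
    """
    total = signal1 + signal2
    if total <= 0:
        return "no call"
    fraction = signal1 / total
    if fraction >= het_high:
        return "homozygous first base"   # ratio near 1:0
    if fraction <= het_low:
        return "homozygous second base"  # ratio near 0:1
    return "heterozygous"                # ratio near 1:1


# Hypothetical label intensities giving a ratio close to 1:1.
genotype = call_genotype(100.0, 95.0)
```

In practice the signals would first be normalized for differences in label intensity and hybridization efficiency, as the text notes.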

The present invention also finds use as a methodology for the detection of mutations or mismatches in target nucleic acid sequences. For example, recent focus has been on the analysis of the relationship between genetic variation and phenotype by making use of polymorphic DNA markers. Previous work utilized short tandem repeats (STRs) as polymorphic positional markers; however, recent focus is on the use of single nucleotide polymorphisms (SNPs), which occur at an average frequency of more than 1 per kilobase in human genomic DNA. Some SNPs, particularly those in and around coding sequences, are likely to be the direct cause of therapeutically relevant phenotypic variants. There are a number of well known polymorphisms that cause clinically important phenotypes; for example, the apoE2/3/4 variants are associated with different relative risk of Alzheimer's and other diseases (see Corder et al., Science 261 (1993)). Multiplex PCR amplification of SNP loci with subsequent hybridization to oligonucleotide arrays has been shown to be an accurate and reliable method of simultaneously genotyping at least hundreds of SNPs; see Wang et al., Science, 280:1077 (1998); see also Schafer et al., Nature Biotechnology 16:33-39 (1998). The compositions of the present invention may easily be substituted for the arrays of the prior art.

Generally, a sample containing a target analyte (whether for detection of the target analyte or screening for binding partners of the target analyte) is added to the array, under conditions suitable for binding of the target analyte to at least one of the capture probes, i.e. generally physiological conditions. The presence or absence of the target analyte is then detected. As will be appreciated by those in the art, this may be done in a variety of ways, generally through the use of a change in an optical signal. This change can occur via many different mechanisms. A few examples include the binding of a dye-tagged analyte to the bead, the production of a dye species on or near the beads, the destruction of an existing dye species, a change in the optical signature upon analyte interaction with dye on the bead, or any other optically interrogatable event.

In a preferred embodiment, the change in optical signal occurs as a result of the binding of a target analyte that is labeled, either directly or indirectly, with a detectable label, preferably an optical label such as a fluorochrome. Thus, for example, when a proteinaceous target analyte is used, it may be either directly labeled with a fluor, or indirectly, for example through the use of a labeled antibody. Similarly, nucleic acids are easily labeled with fluorochromes, for example during PCR amplification as is known in the art. Alternatively, upon binding of the target sequences, a hybridization indicator may be used as the label. Hybridization indicators preferentially associate with double stranded nucleic acid, usually reversibly. Hybridization indicators include intercalators and minor and/or major groove binding moieties. In a preferred embodiment, intercalators may be used; since intercalation generally only occurs in the presence of double stranded nucleic acid, only in the presence of target hybridization will the label light up. Thus, upon binding of the target analyte to a capture probe, there is a new optical signal generated at that site, which then may be detected.

Alternatively, in some cases, as discussed above, the target analyte, such as an enzyme, generates a species that is either directly or indirectly optically detectable.

Furthermore, in some embodiments, a change in the optical signature may be the basis of the optical signal. For example, the interaction of some chemical target analytes with some fluorescent dyes on the beads may alter the optical signature, thus generating a different optical signal.

As will be appreciated by those in the art, in some embodiments, detection of the presence or absence of the target analyte may be done using changes in other optical or non-optical signals, including, but not limited to, surface enhanced Raman spectroscopy, surface plasmon resonance, radioactivity, etc.

The assays may be run under a variety of experimental conditions, as will be appreciated by those in the art. A variety of other reagents may be included in the screening assays. These include reagents like salts, neutral proteins, e.g. albumin, detergents, etc., which may be used to facilitate optimal protein-protein binding and/or reduce non-specific or background interactions. Also, reagents that otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture of components may be added in any order that provides for the requisite binding. Various blocking and washing steps may be utilized as is known in the art.

In addition, the present invention provides kits for the reactions of the invention, comprising components of the assays as outlined herein. In addition, a variety of other reagents may be included in the assays or the kits. These include reagents like salts, neutral proteins, e.g. albumin, detergents, etc., which may be used to facilitate optimal protein-protein binding and/or reduce non-specific or background interactions. Also, reagents that otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture of components may be added in any order that provides for the requisite activity.

All references cited herein are incorporated by reference in their entirety.

1.-34. (canceled)
35. A system for detecting a plurality of target nucleic acid sequences, comprising: a) an array of capture probes attached to discrete sites on a solid support adapted to hybridize to adapter sequences of modified primers or their complements; b) a solution comprising modified primers, wherein the solution is in contact with the array and is prepared by a method comprising: i) hybridizing a plurality of different first primers to first portions of a plurality of target sequences, wherein each of said different first primers comprises an adapter sequence exogenous to said target sequences, ii) hybridizing a plurality of different second primers to second portions of said plurality of target sequences, thereby forming a plurality of hybridization complexes, iii) extending said first or said second primers, and ligating said first and second primers together to form a plurality of different modified primers, and iv) removing unextended first or second primers from said modified primers; c) a polymerase adapted to extend said capture probes; and d) a detector adapted to detect extended capture probes hybridized to said modified primers.
36. The system of claim 35, wherein the modified primers are hybridized to the capture probes.
37. The system of claim 36, wherein the capture probes have been extended by the polymerase.
38. The system of claim 36, wherein the capture probes have been amplified by the polymerase.
39. The system of claim 35, wherein the method of preparing the solution comprises amplifying the plurality of different modified primers.
40. The system of claim 39, wherein the amplified modified primers are hybridized to the capture probes.
41. The system of claim 39, wherein the amplifying the plurality of different modified primers comprises hybridizing said plurality of different modified primers with a plurality of amplifier probes complementary to said plurality of first and second primers and amplifying said different modified primers.
42. The system of claim 35, wherein the plurality of target sequences are immobilized to one or more beads.
43. The system of claim 35, wherein the plurality of target nucleic acid sequences comprises deoxyribonucleic acids.
44. The system of claim 35, wherein the removing the unextended first or second primers comprises removing the unextended first or second primers by affinity chromatography.
45. The system of claim 35, wherein the detector is adapted to identify a nucleotide at a detection position for each of said target nucleic acid sequences, wherein a primer of said plurality of first primers or said plurality of second primers is complementary to said detection position.
46. The system of claim 35, wherein said array comprises a population of beads comprising said capture probes.
47. The system of claim 46, wherein said beads are associated with individual sites of said solid support.
48. The system of claim 47, wherein each of said sites is configured to have a single associated bead.
49. The system of claim 35, wherein the detector is adapted to detect a label attached to said modified capture probes.
50. The system of claim 49, wherein the label comprises a fluorescent label.
51. The system of claim 35, wherein said target sequences comprise loci having a single nucleotide polymorphism (SNP) allele.
52. The system of claim 51, wherein said plurality of said first primers comprise allele specific primers and said plurality of second primers comprise locus specific primers.
53. The system of claim 52, wherein a terminal base of said allele specific primers corresponds to said SNP allele.